Selection of survey sites maximizing uniformity in environmental space
Source: R/uniformE_selection.R
Selection of sites to be sampled in a survey, with the goal of maximizing uniformity of points in environmental space.
Usage
uniformE_selection(master, variable_1 = NULL, variable_2 = NULL,
                   selection_from = "all_points", expected_points,
                   guess_distances = TRUE, initial_distance = NULL,
                   increase = NULL, max_n_samplings = 1,
                   replicates = 10, use_preselected_sites = TRUE,
                   median_distance_filter = NULL, set_seed = 1,
                   verbose = TRUE, force = FALSE)
Arguments
- master
master_matrix object derived from the function prepare_master_matrix, or master_selection object derived from the functions random_selection, uniformG_selection, or EG_selection.
- variable_1
(character or numeric) name or position of the first variable (x-axis). If the function make_blocks was used in a previous step, the default, NULL, will use the same two variables; otherwise, this argument must be defined.
- variable_2
(character or numeric) name or position of the second variable (y-axis). If the function make_blocks was used in a previous step, the default, NULL, will use the same two variables; otherwise, this argument must be defined.
- selection_from
(character) set of points to perform the selection from. Two options are available: "all_points" or "block_centroids". The first option picks points from all points in the environmental cloud; the second selects points only from the centroids of environmental blocks. See make_blocks. Default = "all_points".
- expected_points
(numeric) total number of survey points (sites) to be selected.
- guess_distances
(logical) whether or not to use an internal algorithm to automatically select initial_distance and increase. Default = TRUE. If FALSE, initial_distance and increase must be defined.
- initial_distance
(numeric) Euclidean distance to be used for a first process of thinning and detection of remaining points. Default = NULL.
- increase
(numeric) initial value to be added to or subtracted from initial_distance until reaching the number of expected_points. Default = NULL.
- max_n_samplings
(numeric) maximum number of samples to be chosen after performing all thinning replicates. Default = 1.
- replicates
(numeric) number of thinning replicates. Default = 10.
- use_preselected_sites
(logical) whether to use sites that have been defined as part of the selected sites prior to any selection. The object in master must contain the preselected site(s) in an element named "preselected_sites" for this argument to be effective. Default = TRUE. See Details for more information on the approach used.
- median_distance_filter
(character) optional argument defining a filter, based on median distance, that is used to choose among sets of sampling sites. The default, NULL, does not apply such a filter. Options are: "max" and "min".
- set_seed
(numeric) integer value to specify an initial seed. Default = 1.
- verbose
(logical) whether or not to print messages about the process. Default = TRUE.
- force
(logical) whether to replace an existing set of sites selected with this method in master. Default = FALSE.
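The roles of initial_distance, increase, and expected_points can be illustrated with a minimal base R sketch. It is not the package's internal code: thin_points and search_distance are hypothetical helpers, the thinning is a simple greedy pass, and halving the step whenever the search overshoots is an assumption made here so that the loop settles.

```r
# Hypothetical helper: greedy distance-based thinning in two dimensions.
# A point is kept only if it is at least `d` away from every point kept so far.
thin_points <- function(xy, d) {
  keep <- 1L
  for (i in seq_len(nrow(xy))[-1]) {
    dx <- xy[keep, 1] - xy[i, 1]
    dy <- xy[keep, 2] - xy[i, 2]
    if (all(sqrt(dx^2 + dy^2) >= d)) {
      keep <- c(keep, i)
    }
  }
  xy[keep, , drop = FALSE]
}

# Hypothetical helper: adjust the thinning distance until the number of
# remaining points equals `expected_points` (or `max_iter` is reached).
search_distance <- function(xy, expected_points, initial_distance, increase,
                            max_iter = 100) {
  stopifnot(expected_points <= nrow(xy), initial_distance > 0, increase > 0)
  d <- initial_distance
  step <- increase
  last_direction <- 0
  for (iter in seq_len(max_iter)) {
    thinned <- thin_points(xy, d)
    n <- nrow(thinned)
    if (n == expected_points) {
      break
    }
    # Too many points: use a larger distance; too few: use a smaller one.
    direction <- if (n > expected_points) 1 else -1
    # Assumption of this sketch: halve the step when the search overshoots.
    if (last_direction != 0 && direction != last_direction) {
      step <- step / 2
    }
    last_direction <- direction
    if (iter < max_iter) {
      d <- d + direction * step
    }
  }
  list(distance = d, sites = thinned)
}

# A regular 10 x 10 grid standing in for two environmental variables.
grid <- as.matrix(expand.grid(x = 0:9, y = 0:9))
result <- search_distance(grid, expected_points = 25,
                          initial_distance = 1, increase = 0.5)
result$distance     # 1.5
nrow(result$sites)  # 25
```

On this grid, a distance of 1 keeps all 100 points, so the distance is raised by the value of increase to 1.5, which leaves exactly 25 points. With irregular data an exact match is not guaranteed, which is why the sketch caps the number of iterations.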
Value
A master_selection object (S3) with an element called selected_sites_E containing one or more sets of selected sites.
Details
Survey sites are selected so that points are uniformly dispersed in environmental space, which helps to select sites that present different environmental conditions across the area of interest. This type of selection is very useful for including, among the selected sites, the distinct environmental combinations that exist in the area of interest. However, as the distribution of climatic or other environmental combinations is not uniform in geography, the sites selected with this function could appear clustered when viewed on a map.
Exploring the geographic and environmental spaces of the region of interest is a crucial first step before selecting survey sites. Such explorations can be done using the function explore_data_EG.
If use_preselected_sites = TRUE and such sites are included as an element of the object in master, the approach for selecting uniform sites in environmental space differs from the one described above. User-preselected sites will always be part of the selected sites. Other points are selected with an algorithm that searches for sites that are uniformly distributed in environmental space but at a distance from preselected sites that helps maintain uniformity. Note that preselected sites are not processed; therefore, uniformity of such points cannot be guaranteed.
As multiple sets could result from selection, the argument median_distance_filter can be used to select the set of sites with the maximum ("max") or minimum ("min") median distance among selected sites. Option "max" will increase the geographic distance among sampling sites, which could be desirable if the goal is to cover the region of interest more broadly. The other option, "min", could be used when the goal is to reduce the resources and time needed to sample such sites.
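The idea behind median_distance_filter can be sketched in base R as follows. This is only an illustration under stated assumptions: median_pairwise_distance and filter_by_median_distance are hypothetical helpers, not functions of the package, and plain Euclidean distance on the Longitude and Latitude columns is used as a simplification of geographic distance.

```r
# Hypothetical helper: median of all pairwise distances among the sites of
# one set (Euclidean distance on coordinates, a simplification).
median_pairwise_distance <- function(sites) {
  d <- stats::dist(sites[, c("Longitude", "Latitude")])
  stats::median(as.numeric(d))
}

# Hypothetical helper: keep the set with the largest ("max") or smallest
# ("min") median pairwise distance.
filter_by_median_distance <- function(site_sets,
                                      median_distance_filter = c("max", "min")) {
  median_distance_filter <- match.arg(median_distance_filter)
  med <- vapply(site_sets, median_pairwise_distance, numeric(1))
  pick <- if (median_distance_filter == "max") which.max(med) else which.min(med)
  site_sets[pick]
}

# Two made-up candidate sets of three sites each.
site_sets <- list(
  set_a = data.frame(Longitude = c(0, 1, 2), Latitude = c(0, 0, 0)),
  set_b = data.frame(Longitude = c(0, 3, 6), Latitude = c(0, 0, 0))
)

chosen_max <- filter_by_median_distance(site_sets, "max")
chosen_min <- filter_by_median_distance(site_sets, "min")
names(chosen_max)  # "set_b" (median distance 3)
names(chosen_min)  # "set_a" (median distance 1)
```

The more spread-out set is retained with "max", and the more compact one with "min", which mirrors the trade-off between broad coverage and sampling effort described above.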
Examples
# Data
m_matrix <- read_master(system.file("extdata/m_matrix.rds",
                                    package = "biosurvey"))
# Making blocks for analysis
m_blocks <- make_blocks(m_matrix, variable_1 = "PC1",
                        variable_2 = "PC2", n_cols = 10, n_rows = 10,
                        block_type = "equal_area")
# Checking column names
colnames(m_blocks$data_matrix)
#> [1] "Longitude" "Latitude" "Mean_temperature"
#> [4] "Max_temperature" "Min_temperature" "Annual_precipitation"
#> [7] "Prec_wettest_month" "Prec_driest_month" "PC1"
#> [10] "PC2" "Block"
# Selecting sites uniformly in E space
# Because the make_blocks function was used, the same two variables will be
# used by default.
selectionE <- uniformE_selection(m_blocks, selection_from = "block_centroids",
                                 expected_points = 15, max_n_samplings = 1,
                                 replicates = 5, set_seed = 1)
#> Element 'preselected_sites' in 'master' is NULL, setting
#> 'use_preselected_sites' = FALSE
#> Running algorithm for selecting sites, please wait...
#> Distance 0.93 resulted in 33 points
#> Distance 1.023 resulted in 29 points
#> Distance 1.116 resulted in 24 points
#> Distance 1.209 resulted in 21 points
#> Distance 1.302 resulted in 18 points
#> Distance 1.395 resulted in 15 points
#> Total number of sites selected: 15