Selection of sites to be sampled in a survey, with the goal of maximizing uniformity of points in environmental space.

Usage

uniformE_selection(master, variable_1 = NULL, variable_2 = NULL,
                   selection_from = "all_points", expected_points,
                   guess_distances = TRUE, initial_distance = NULL,
                   increase = NULL, max_n_samplings = 1,
                   replicates = 10, use_preselected_sites = TRUE,
                   median_distance_filter = NULL, set_seed = 1,
                   verbose = TRUE, force = FALSE)

Arguments

master

master_matrix object derived from function prepare_master_matrix or master_selection object derived from functions random_selection, uniformG_selection, or EG_selection.

variable_1

(character or numeric) name or position of the first variable (x-axis). If the function make_blocks was used in a previous step, the default, NULL, will use the same two variables; otherwise, this argument must be defined.

variable_2

(character or numeric) name or position of the second variable (y-axis). If the function make_blocks was used in a previous step, the default, NULL, will use the same two variables; otherwise, this argument must be defined.

selection_from

(character) set of points to perform the selection from. Two options are available, "all_points" or "block_centroids". The first option picks the points from all points in the environmental cloud, and the second one selects points only from centroids of environmental blocks. See make_blocks. Default = "all_points".

expected_points

(numeric) total number of survey points (sites) to be selected.

guess_distances

(logical) whether to use an internal algorithm to automatically select initial_distance and increase. Default = TRUE. If FALSE, initial_distance and increase must be defined.

initial_distance

(numeric) Euclidean distance to be used for a first process of thinning and detection of remaining points. Default = NULL.

increase

(numeric) initial value to be added to or subtracted from initial_distance until reaching the number of expected_points. Default = NULL.

max_n_samplings

(numeric) maximum number of samples to be chosen after performing all thinning replicates. Default = 1.

replicates

(numeric) number of thinning replicates. Default = 10.

use_preselected_sites

(logical) whether to use sites that were defined as part of the selected sites prior to any selection. The object in master must contain the preselected site(s) in an element named "preselected_sites" for this argument to be effective. Default = TRUE. See details for more information on the approach used.

median_distance_filter

(character) optional filter, based on the median distance among selected sites, used to choose among the resulting sets of sampling sites. The default, NULL, does not apply such a filter. Options are: "max" and "min".

set_seed

(numeric) integer value to specify an initial seed. Default = 1.

verbose

(logical) whether or not to print messages about the process. Default = TRUE.

force

(logical) whether to replace an existing set of sites selected with this method in master. Default = FALSE.

Value

A master_selection object (S3) with an element called selected_sites_E containing one or more sets of selected sites.

Details

Survey sites are selected so that points are uniformly dispersed in environmental space, which helps to select sites that present different environmental conditions across the area of interest. This type of selection is useful for including, among the selected sites, the distinct environmental combinations that exist in the area of interest. However, as the distribution of climatic or other environmental combinations is not uniform in geography, the sites selected with this function could appear clustered when viewed on a map.
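The arguments initial_distance and increase describe a distance-based thinning search. The following base R code is a minimal sketch of that idea, not the code used by the package: the helpers thin_points and find_thinning_distance are hypothetical, the thinning is a simple greedy pass, and halving the step when the search overshoots is an assumption made only for illustration.

```r
# Greedy thinning (hypothetical helper): keep a point only if it is at
# least `d` away, in Euclidean distance, from every point already kept.
thin_points <- function(xy, d) {
  kept <- 1L
  for (i in seq_len(nrow(xy))[-1]) {
    dists <- sqrt((xy[kept, 1] - xy[i, 1])^2 + (xy[kept, 2] - xy[i, 2])^2)
    if (all(dists >= d)) kept <- c(kept, i)
  }
  xy[kept, , drop = FALSE]
}

# Adjust the thinning distance until the expected number of points remains
# (hypothetical helper; the step-halving rule is an assumption).
find_thinning_distance <- function(xy, expected_points, initial_distance,
                                   increase, max_iter = 100) {
  d <- initial_distance
  step <- increase
  last_sign <- 0
  iter <- 0
  thinned <- thin_points(xy, d)
  while (nrow(thinned) != expected_points && iter < max_iter) {
    # Too many points: thin more (larger distance); too few: thin less
    sign_now <- if (nrow(thinned) > expected_points) 1 else -1
    # Halve the step whenever the search changes direction
    if (last_sign != 0 && sign_now != last_sign) step <- step / 2
    d <- d + sign_now * step
    last_sign <- sign_now
    thinned <- thin_points(xy, d)
    iter <- iter + 1
  }
  list(distance = d, points = thinned)
}

# Toy example: 11 points on a line, looking for 3 well-separated points
toy <- cbind(x = 0:10, y = 0)
res <- find_thinning_distance(toy, expected_points = 3,
                              initial_distance = 1, increase = 1)
res$distance       # 4 for this toy example
nrow(res$points)   # 3
```

In this toy case the distance grows from 1 to 4 in steps of 1, mirroring the way the messages in the example report how many points remain at each distance tested.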

Exploring the geographic and environmental spaces of the region of interest is a crucial first step before selecting survey sites. Such explorations can be done using the function explore_data_EG.

If use_preselected_sites = TRUE and such sites are included as an element in the object in master, the approach for selecting uniform sites in environmental space differs from the one described above. User-preselected sites will always be part of the sites selected. Other points are selected based on an algorithm that searches for sites that are uniformly distributed in environmental space but at a distance from preselected sites that helps in maintaining uniformity. Note that preselected sites will not be processed; therefore, uniformity of such points cannot be guaranteed.

As multiple sets could result from selection, the argument median_distance_filter could be used to select the set of sites with the maximum ("max") or minimum ("min") median distance among selected sites. Option "max" will increase the geographic distance among sampling sites, which could be desirable if the goal is to cover the region of interest more broadly. The other option, "min", could be used when the goal is to reduce the resources and time needed to sample such sites.
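The median distance filter can be illustrated with a small base R sketch. It is not the package's implementation: the helpers median_distance and filter_by_median_distance are hypothetical, the candidate sets are made-up toy data, and plain Euclidean distance on the Longitude and Latitude columns stands in for a proper geographic distance.

```r
# Median pairwise distance among the sites of one candidate set
# (hypothetical helper; Euclidean distance on coordinates is a simplification).
median_distance <- function(sites) {
  d <- stats::dist(sites[, c("Longitude", "Latitude")])
  stats::median(as.numeric(d))
}

# Pick the set with the largest ("max") or smallest ("min") median distance
filter_by_median_distance <- function(site_sets, filter = c("max", "min")) {
  filter <- match.arg(filter)
  med <- vapply(site_sets, median_distance, numeric(1))
  pick <- if (filter == "max") which.max(med) else which.min(med)
  site_sets[[pick]]
}

# Two toy candidate sets: one compact, one spread out
compact <- data.frame(Longitude = c(0, 1, 2), Latitude = c(0, 0, 0))
spread  <- data.frame(Longitude = c(0, 5, 10), Latitude = c(0, 0, 0))
sets <- list(compact, spread)

broad_cover <- filter_by_median_distance(sets, "max")  # returns `spread`
low_effort  <- filter_by_median_distance(sets, "min")  # returns `compact`
```

With "max" the more dispersed set is kept, which favors broad coverage; with "min" the more compact set is kept, which favors lower travel effort.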

Examples

# Data
m_matrix <- read_master(system.file("extdata/m_matrix.rds",
                                    package = "biosurvey"))

# Making blocks for analysis
m_blocks <- make_blocks(m_matrix, variable_1 = "PC1",
                        variable_2 = "PC2", n_cols = 10, n_rows = 10,
                        block_type = "equal_area")

# Checking column names
colnames(m_blocks$data_matrix)
#>  [1] "Longitude"            "Latitude"             "Mean_temperature"    
#>  [4] "Max_temperature"      "Min_temperature"      "Annual_precipitation"
#>  [7] "Prec_wettest_month"   "Prec_driest_month"    "PC1"                 
#> [10] "PC2"                  "Block"               

# Selecting sites uniformly in E space
# because the make_blocks function was used, the same two variables will be
# used by default.
selectionE <- uniformE_selection(m_blocks, selection_from = "block_centroids",
                                 expected_points = 15, max_n_samplings = 1,
                                 replicates = 5, set_seed = 1)
#> Element 'preselected_sites' in 'master' is NULL, setting
#> 'use_preselected_sites' = FALSE
#> Running algorithm for selecting sites, please wait...
#>     Distance  0.93  resulted in  33  points
#>     Distance  1.023  resulted in  29  points
#>     Distance  1.116  resulted in  24  points
#>     Distance  1.209  resulted in  21  points
#>     Distance  1.302  resulted in  18  points
#>     Distance  1.395  resulted in  15  points
#> Total number of sites selected: 15
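
When guess_distances = FALSE, initial_distance and increase must be supplied. The sketch below repeats the example with manually defined values. The numbers 0.93 and 0.093 are assumptions read from the messages printed above (the first distance tested and the difference between consecutive distances), not recommended settings, and the result is not guaranteed to match the run with guessed distances. The call is wrapped in a check so that it only runs if biosurvey is installed.

```r
# Runs only when the biosurvey package is available
if (requireNamespace("biosurvey", quietly = TRUE)) {
  library(biosurvey)

  # Data and blocks, as in the example above
  m_matrix <- read_master(system.file("extdata/m_matrix.rds",
                                      package = "biosurvey"))
  m_blocks <- make_blocks(m_matrix, variable_1 = "PC1",
                          variable_2 = "PC2", n_cols = 10, n_rows = 10,
                          block_type = "equal_area")

  # Selection with manually defined distances (values are assumptions
  # taken from the messages of the previous run)
  selectionE_manual <- uniformE_selection(m_blocks,
                                          selection_from = "block_centroids",
                                          expected_points = 15,
                                          guess_distances = FALSE,
                                          initial_distance = 0.93,
                                          increase = 0.093,
                                          max_n_samplings = 1,
                                          replicates = 5, set_seed = 1)

  # Sets of selected sites are stored in the element selected_sites_E
  names(selectionE_manual$selected_sites_E)
}
```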