Declare Sampling Procedure

declare_sampling(..., sampling_function = sampling_function_default)

Arguments

...

Arguments to the sampling function

sampling_function

A function that takes a data.frame, subsets it to the sampled observations, optionally adds sampling probabilities or other relevant quantities, and returns a data.frame. By default, the sampling_function uses the randomizr functions draw_rs and obtain_inclusion_probabilities to conduct random sampling and obtain the probability of inclusion in the sample.
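The contract described above (data.frame in, sampled data.frame out) can be sketched in base R as follows. This is a minimal illustration, not the package default: the function name srs_sampling_function, its n argument, and the toy data are hypothetical, and the column name S_inclusion_prob is only borrowed from the default output shown in the examples.

```r
# Hypothetical custom sampling function (not part of the package):
# draws a simple random sample of n rows without replacement and
# records each unit's probability of inclusion.
srs_sampling_function <- function(data, n = 10) {
  N <- nrow(data)
  stopifnot(n <= N)
  # Pick n row indices at random; sort() keeps the original row order
  sampled_rows <- sort(sample(seq_len(N), size = n))
  sampled <- data[sampled_rows, , drop = FALSE]
  # Under simple random sampling every unit is included with probability n / N
  sampled$S_inclusion_prob <- n / N
  sampled
}

# Toy data for illustration only
set.seed(42)
toy_data <- data.frame(ID = sprintf("%03d", 1:20), female = rbinom(20, 1, 0.5))
sampled_data <- srs_sampling_function(toy_data, n = 5)
dim(sampled_data)
```

Here the result has 5 rows and 3 columns (ID, female, S_inclusion_prob), with every inclusion probability equal to 5 / 20. A function of this shape could then be supplied through the sampling_function argument, in the same way as the custom function in the examples.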

Value

A function that takes a data.frame as an argument and returns a data.frame subset to the sampled observations and (optionally) augmented with inclusion probabilities and other quantities.

Details

While declare_sampling can work with any sampling_function that takes data and returns data, most random sampling procedures can be easily implemented with randomizr. The arguments to draw_rs can include N, strata_var, clust_var, n, prob, strata_n, and strata_prob. The arguments you need to specify are different for different designs. Check the help files for complete_rs, strata_rs, cluster_rs, or strata_and_cluster_rs for details on how to execute many common designs.
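To make the idea of a stratified design concrete without relying on a particular randomizr version, the following base R sketch samples a fixed share of units within each stratum. It is a hand-rolled stand-in, not the code behind strata_rs; the function name stratified_sampling_function and its strata_col and share arguments are hypothetical.

```r
# Hand-rolled stand-in for a stratified design (not randomizr's code):
# sample a fixed share of units within each level of a stratum column.
stratified_sampling_function <- function(data, strata_col, share = 0.5) {
  # Group row indices by stratum
  rows_by_stratum <- split(seq_len(nrow(data)), data[[strata_col]])
  sampled_rows <- unlist(lapply(rows_by_stratum, function(rows) {
    n_stratum <- floor(length(rows) * share)
    # Index into rows instead of calling sample(rows, ...), so that a
    # single-row stratum is not misread by sample() as the range 1:rows
    rows[sample.int(length(rows), size = n_stratum)]
  }), use.names = FALSE)
  data[sort(sampled_rows), , drop = FALSE]
}

# Toy data for illustration only: 12 units with female == 0, 8 with female == 1
toy_data <- data.frame(
  ID = sprintf("%03d", 1:20),
  female = rep(c(0, 1), times = c(12, 8))
)
set.seed(1)
stratified_sample <- stratified_sampling_function(toy_data, "female")
table(stratified_sample$female)
```

With share = 0.5, the sample contains 6 of the 12 units with female == 0 and 4 of the 8 units with female == 1, so the strata keep their relative sizes. For real designs, the randomizr help files named above remain the authoritative reference.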

Examples

my_population <- declare_population(N = 100, female = rbinom(N, 1, .5))
df <- my_population()

# Simple random sampling using randomizr
# use any arguments you would use in draw_rs.
my_sampling <- declare_sampling(n = 50)
df <- my_sampling(df)
dim(df)
#> [1] 50 3
head(df)
#>     ID female S_inclusion_prob
#> 2  002      1              0.5
#> 4  004      0              0.5
#> 7  007      1              0.5
#> 9  009      0              0.5
#> 11 011      1              0.5
#> 12 012      0              0.5

# Stratified random sampling
my_stratified_sampling <- declare_sampling(strata_var = female)
df <- my_population()
table(df$female)
#>
#>  0  1
#> 54 46
df <- my_stratified_sampling(df)
table(df$female)
#>
#>  0  1
#> 27 23

# Custom random sampling functions
df <- my_population()

my_sampling_function <- function(data) {
  data$S <- rbinom(n = nrow(data), size = 1, prob = 0.5)
  data[data$S == 1, ]
}

my_sampling_custom <- declare_sampling(
  sampling_function = my_sampling_function)

df <- my_sampling_custom(df)
dim(df)
#> [1] 46 3
head(df)
#>     ID female S
#> 4  004      0 1
#> 6  006      1 1
#> 7  007      1 1
#> 9  009      1 1
#> 10 010      1 1
#> 14 014      0 1