Declare a random sampling procedure.

Usage

declare_rs(
  N = NULL,
  strata = NULL,
  clusters = NULL,
  n = NULL,
  n_unit = NULL,
  prob = NULL,
  prob_unit = NULL,
  strata_n = NULL,
  strata_prob = NULL,
  simple = FALSE,
  check_inputs = TRUE
)

Arguments

N

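The number of units. N must be a positive integer. Required unless strata or clusters is supplied, in which case the number of units is taken from the length of that vector.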

strata

A vector of length N that indicates which stratum each unit belongs to.

clusters

A vector of length N that indicates which cluster each unit belongs to.
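As a minimal sketch of what clustering implies, the snippet below samples 10 of 26 clusters and checks that all units in a cluster share the same sampling status. It assumes the randomizr package is installed; the variable names are illustrative only.

```r
library(randomizr)

# 26 clusters of unequal size: cluster "a" has 1 unit, cluster "z" has 26
clusters <- rep(letters, times = 1:26)

# Sample 10 of the 26 clusters
declaration <- declare_rs(clusters = clusters, n = 10)
S <- draw_rs(declaration)

# Units are sampled together with their cluster, so each cluster
# should contain exactly one distinct value of S
statuses_per_cluster <- tapply(S, clusters, function(x) length(unique(x)))
all(statuses_per_cluster == 1)

# Number of clusters that were sampled
sum(tapply(S, clusters, max))
```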

n

Use for a design in which n units (or clusters) are sampled. In a stratified design, exactly n units in each stratum will be sampled. (optional)

n_unit

The unit-level version of n: a vector of length N giving the number of units to sample. Under complete random sampling, must be constant across units. Under stratified random sampling, must be constant within strata.

prob

Use for a design in which either floor(N*prob) or ceiling(N*prob) units (or clusters) are sampled. The probability of being sampled is exactly prob because the choice between the two sample sizes is itself random: writing r = N*prob - floor(N*prob) for the fractional part of N*prob, floor(N*prob) units (or clusters) are sampled with probability 1-r and ceiling(N*prob) units (or clusters) are sampled with probability r. When N*prob is an integer, exactly N*prob units (or clusters) are sampled. prob must be a real number between 0 and 1 inclusive. (optional)
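The rounding step can be sketched in base R. This is an illustration of how an inclusion probability of exactly prob can be achieved when N*prob is not an integer, not the package's own code: the sample size is rounded up with probability equal to the fractional part of N*prob. The helper choose_n is hypothetical.

```r
# Hypothetical helper, not part of the package: choose the sample size for
# complete random sampling of N units with target inclusion probability prob
choose_n <- function(N, prob) {
  n_floor <- floor(N * prob)
  n_ceiling <- ceiling(N * prob)
  # Fractional part of N * prob: the chance of rounding up
  remainder <- N * prob - n_floor
  if (runif(1) < remainder) n_ceiling else n_floor
}

# Expected sample size is (1 - remainder) * floor + remainder * ceiling,
# which equals N * prob, so each unit is sampled with probability prob
N <- 10
prob <- 0.25
remainder <- N * prob - floor(N * prob)
expected_n <- (1 - remainder) * floor(N * prob) + remainder * ceiling(N * prob)
expected_n / N
```

With N = 10 and prob = 0.25, the sketch samples 2 or 3 units with equal chance, so the expected sample size is 2.5 and the inclusion probability is 0.25.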

prob_unit

Must be of length N. Under simple random sampling, can be different for each unit or cluster. Under complete random sampling, must be constant across units. Under stratified random sampling, must be constant within strata.

strata_n

Use for a design in which strata_n describes the number of units to sample within each stratum.

strata_prob

Use for a design in which strata_prob describes the probability of being sampled within each stratum. Differs from prob in that the probability of being sampled can vary across strata.
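A short sketch of both stratum-level arguments follows. It assumes the randomizr package is installed and that the entries of strata_n and strata_prob are matched to strata in sorted order (here A, B, C).

```r
library(randomizr)

strata <- rep(c("A", "B", "C"), times = c(50, 100, 200))

# Sample 10, 20, and 30 units from strata A, B, and C
# (assumption: entries follow the sorted order of the strata labels)
declaration_n <- declare_rs(strata = strata, strata_n = c(10, 20, 30))
S <- draw_rs(declaration_n)
table(strata, S)

# Give each stratum its own sampling probability
declaration_prob <- declare_rs(strata = strata, strata_prob = c(0.1, 0.2, 0.3))
probs <- obtain_inclusion_probabilities(declaration_prob)
table(strata, probs)
```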

simple

logical. Defaults to FALSE. If TRUE, simple random sampling is used. When simple = TRUE, do not specify n or strata_n, and the probability of being sampled may vary by unit.
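The reason n and strata_n do not apply under simple random sampling is that the sample size is not fixed in advance. The base R sketch below contrasts the two schemes without using the package. It illustrates the sampling schemes themselves, not the package's internals.

```r
set.seed(42)
N <- 100
prob <- 0.3

# Simple random sampling: one independent coin flip per unit,
# so the realized sample size varies from draw to draw
simple_sizes <- replicate(1000, sum(rbinom(N, size = 1, prob = prob)))
range(simple_sizes)

# Complete random sampling: exactly n units are chosen every time
n <- 30
complete_sizes <- replicate(1000, {
  S <- rep(0, N)
  S[sample.int(N, n)] <- 1
  sum(S)
})
range(complete_sizes)  # both ends equal n
```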

check_inputs

logical. Defaults to TRUE.

Value

A list of class "declaration". The list has five entries: $rs_function, a function that generates random samples according to the declaration; $rs_type, a string indicating the type of random sampling used; $probabilities_vector, a vector of length N indicating each unit's probability of being sampled; $strata, the stratification variable; and $clusters, the clustering variable.
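The entries can be inspected directly, as in this minimal sketch. It assumes the randomizr package is installed and that $rs_function can be called with no arguments. draw_rs() and obtain_inclusion_probabilities() remain the documented way to use a declaration.

```r
library(randomizr)

declaration <- declare_rs(N = 100, n = 30)

# Entries named in the Value section
declaration$rs_type                     # type of sampling, as a string
head(declaration$probabilities_vector)  # per-unit inclusion probabilities
length(declaration$probabilities_vector)

# Assumption: $rs_function takes no arguments and returns one random sample
S <- declaration$rs_function()
sum(S)
```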

Examples

# The declare_rs function is used in three ways:

# 1. To obtain some basic facts about a sampling procedure:
declaration <- declare_rs(N = 100, n = 30)
declaration
#> Random sampling procedure: Complete random sampling 
#> Number of units: 100 
#> The inclusion probabilities are constant across units.

# 2. To draw a random sample:

S <- draw_rs(declaration)
table(S)
#> S
#>  0  1 
#> 70 30 

# 3. To obtain inclusion probabilities:

probs <- obtain_inclusion_probabilities(declaration)
table(probs, S)
#>      S
#> probs  0  1
#>   0.3 70 30

# Simple Random Sampling Declarations

declare_rs(N = 100, simple = TRUE)
#> Random sampling procedure: Simple random sampling 
#> Number of units: 100 
#> The inclusion probabilities are constant across units.
declare_rs(N = 100, prob = .4, simple = TRUE)
#> Random sampling procedure: Simple random sampling 
#> Number of units: 100 
#> The inclusion probabilities are constant across units.

# Complete Random Sampling Declarations

declare_rs(N = 100)
#> Random sampling procedure: Complete random sampling 
#> Number of units: 100 
#> The inclusion probabilities are constant across units.
declare_rs(N = 100, n = 30)
#> Random sampling procedure: Complete random sampling 
#> Number of units: 100 
#> The inclusion probabilities are constant across units.

# Stratified Random Sampling Declarations

strata <- rep(c("A", "B", "C"), times = c(50, 100, 200))
declare_rs(strata = strata)
#> Random sampling procedure: Stratified random sampling 
#> Number of units: 350 
#> Number of strata: 3 
#> The inclusion probabilities are constant across units.
declare_rs(strata = strata, prob = .5)
#> Random sampling procedure: Stratified random sampling 
#> Number of units: 350 
#> Number of strata: 3 
#> The inclusion probabilities are constant across units.


# Cluster Random Sampling Declarations

clusters <- rep(letters, times = 1:26)
declare_rs(clusters = clusters)
#> Random sampling procedure: Cluster random sampling 
#> Number of units: 351 
#> Number of clusters: 26 
#> The inclusion probabilities are constant across units.
declare_rs(clusters = clusters, n = 10)
#> Random sampling procedure: Cluster random sampling 
#> Number of units: 351 
#> Number of clusters: 26 
#> The inclusion probabilities are constant across units.

# Stratified and Clustered Random Sampling Declarations

clusters <- rep(letters, times = 1:26)
strata <- rep(NA, length(clusters))
strata[clusters %in% letters[1:5]] <- "stratum_1"
strata[clusters %in% letters[6:10]] <- "stratum_2"
strata[clusters %in% letters[11:15]] <- "stratum_3"
strata[clusters %in% letters[16:20]] <- "stratum_4"
strata[clusters %in% letters[21:26]] <- "stratum_5"

table(strata, clusters)
#>            clusters
#> strata       a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v
#>   stratum_1  1  2  3  4  5  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
#>   stratum_2  0  0  0  0  0  6  7  8  9 10  0  0  0  0  0  0  0  0  0  0  0  0
#>   stratum_3  0  0  0  0  0  0  0  0  0  0 11 12 13 14 15  0  0  0  0  0  0  0
#>   stratum_4  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 16 17 18 19 20  0  0
#>   stratum_5  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 21 22
#>            clusters
#> strata       w  x  y  z
#>   stratum_1  0  0  0  0
#>   stratum_2  0  0  0  0
#>   stratum_3  0  0  0  0
#>   stratum_4  0  0  0  0
#>   stratum_5 23 24 25 26

declare_rs(clusters = clusters, strata = strata)
#> Random sampling procedure: Stratified and clustered random sampling 
#> Number of units: 351 
#> Number of strata: 5 
#> Number of clusters: 26 
#> The inclusion probabilities are constant across units.
declare_rs(clusters = clusters, strata = strata, prob = .3)
#> Random sampling procedure: Stratified and clustered random sampling 
#> Number of units: 351 
#> Number of strata: 5 
#> Number of clusters: 26 
#> The inclusion probabilities are constant across units.