fabricatr

Drawing discrete data based on probabilities or latent traits is a common task that can be cumbersome. Each function in our discrete drawing set creates a different type of discrete data: `draw_binary` creates binary 0/1 data, `draw_binomial` creates binomial data (repeated trial binary data), `draw_categorical` creates categorical data, `draw_ordered` transforms latent data into observed ordered categories, and `draw_count` creates count data (Poisson-distributed). `draw_likert` is an alias to `draw_ordered` that pre-specifies break labels and offers default breaks appropriate for a Likert survey question.

```
draw_binomial(prob = link(latent), trials = 1, N = length(prob),
  latent = NULL, link = "identity", quantile_y = NULL)

draw_categorical(prob = link(latent), N = NULL, latent = NULL,
  link = "identity", category_labels = NULL)

draw_ordered(x = link(latent), breaks = c(-1, 0, 1), break_labels = NULL,
  N = length(x), latent = NULL, link = "identity", quantile_y = NULL)

draw_count(mean = link(latent), N = length(mean), latent = NULL,
  link = "identity", quantile_y = NULL)

draw_binary(prob = link(latent), N = length(prob), link = "identity",
  latent = NULL, quantile_y = NULL)

draw_likert(x, type = 7, breaks = NULL, N = length(x), latent = NULL,
  link = "identity", quantile_y = NULL)

draw_quantile(type = NULL, N = NULL)
```

Arguments

`prob` A number or vector of numbers representing the probability for binary or binomial outcomes; or a number, vector, or matrix of numbers representing probabilities for categorical outcomes. If you supply a link function, these underlying probabilities will be transformed.

`trials` For `draw_binomial`, the number of trials for each observation.

`N` Number of units to draw. Defaults to the length of the vector of probabilities or latent data you provided.

`latent` If the user provides a link argument other than identity, they should provide the variable `latent` rather than `prob` or `mean`.

`link` Link function between the latent variable and the probability of a positive outcome, e.g. "logit", "probit", or "identity". For the "identity" link, the latent variable must be a probability.

`quantile_y` A vector of quantiles; if provided, rather than drawing stochastically from the distribution of interest, data will be drawn at exactly those quantiles.

`category_labels` Vector of labels for the categories produced by `draw_categorical`. If provided, its length must equal the number of categories provided in the `prob` argument.

`x` For `draw_ordered` or `draw_likert`, the latent data for each observation.

`breaks` Vector of breaks to cut a latent outcome into ordered categories with `draw_ordered` or `draw_likert`.

`break_labels` Vector of labels for the breaks to cut a latent outcome into ordered categories with `draw_ordered`. (Optional)

`mean` For `draw_count`, the mean number of count units for each observation.

`type` Type of Likert scale data for `draw_likert`. Valid options are 4, 5, and 7. Type corresponds to the number of categories in the Likert scale.
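To make the `link` and `quantile_y` descriptions more concrete, here is a minimal sketch in base R that does not call fabricatr at all. It assumes that the "probit" and "logit" links correspond to `pnorm()` and `plogis()`, and that drawing at a fixed quantile amounts to evaluating the inverse CDF with `qbinom()`. The variable names are illustrative, and the package internals may differ.

```r
# Sketch in base R (not fabricatr's internals): how a link function maps a
# latent variable to a probability, and what drawing "at a quantile" means.
set.seed(42)
latent <- c(-2, 0, 2)

# Assumed correspondence: "probit" -> normal CDF, "logit" -> logistic CDF.
prob_probit <- pnorm(latent)
prob_logit <- plogis(latent)

# Stochastic draw: one Bernoulli trial per observation.
stochastic <- rbinom(n = length(latent), size = 1, prob = prob_probit)

# Deterministic draw at fixed quantiles: evaluate the binomial inverse CDF
# at the median for ten trials per observation.
quantiles <- rep(0.5, length(latent))
at_quantile <- qbinom(p = quantiles, size = 10, prob = prob_logit)
```

With the "identity" link no transformation is applied, which is why the latent variable must already lie between 0 and 1 in that case.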

Value

A vector of data in accordance with the specification; generally numeric, but for some functions, including `draw_ordered`, it may be a factor if break labels are provided.
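The difference in return type can be checked directly. The sketch below assumes fabricatr is installed and reuses the `draw_ordered` arguments shown in the usage section; the labels "Low", "Medium", and "High" are made up for illustration.

```r
library(fabricatr)

set.seed(1)
x <- rnorm(5)
cuts <- c(-Inf, -1, 1, Inf)

# Without break labels: category codes.
codes <- draw_ordered(x = x, breaks = cuts)

# With break labels (hypothetical labels): a factor.
labelled <- draw_ordered(
  x = x,
  breaks = cuts,
  break_labels = c("Low", "Medium", "High")
)

# Inspect the classes of the two results.
class(codes)
class(labelled)
```

Checking the class before modelling is worthwhile, since a factor outcome and a numeric outcome are treated differently by most R modelling functions.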

Details

For variables with intra-cluster correlations, see `draw_binary_icc` and `draw_normal_icc`.
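As a brief illustration of the clustered variant, the sketch below draws binary data with a fixed intra-cluster correlation. It assumes `draw_binary_icc` accepts `prob`, `clusters`, and `ICC` arguments as described on its own help page; consult that page for the authoritative signature and for the corresponding `draw_normal_icc` arguments.

```r
library(fabricatr)

set.seed(2024)

# Ten observations in each of five clusters.
clusters <- rep(1:5, each = 10)

# Binary outcome with baseline probability 0.5 and ICC of 0.5
# (argument names assumed from the draw_binary_icc help page).
corr_binary <- draw_binary_icc(prob = 0.5, clusters = clusters, ICC = 0.5)

# Share of ones within each cluster; with a high ICC these shares tend to
# vary more across clusters than independent draws would.
tapply(corr_binary, clusters, mean)
```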

Examples

```
library(fabricatr)

# Drawing binary values (success or failure, treatment assignment)
fabricate(N = 3,
          p = c(0, .5, 1),
          binary = draw_binary(prob = p))
#>   ID   p binary
#> 1  1 0.0      0
#> 2  2 0.5      0
#> 3  3 1.0      1
# Drawing binary values with probit link (transforming continuous data
# into a probability range).
fabricate(N = 3,
          x = 10 * rnorm(N),
          binary = draw_binary(latent = x, link = "probit"))
#>   ID         x binary
#> 1  1 15.169750      1
#> 2  2  9.526467      1
#> 3  3 -5.088359      0
# Repeated trials: `draw_binomial`
fabricate(N = 3,
          p = c(0, .5, 1),
          binomial = draw_binomial(prob = p, trials = 10))
#>   ID   p binomial
#> 1  1 0.0        0
#> 2  2 0.5        4
#> 3  3 1.0       10
# Ordered data: transforming latent data into observed, ordinal data.
# Useful for survey responses.
fabricate(N = 3,
          x = 5 * rnorm(N),
          ordered = draw_ordered(x = x,
                                 breaks = c(-Inf, -1, 1, Inf)))
#>   ID         x ordered
#> 1  1 -4.470176       1
#> 2  2  3.509748       3
#> 3  3 -1.477712       1
# Providing break labels for latent data.
fabricate(N = 3,
          x = 5 * rnorm(N),
          ordered = draw_ordered(x = x,
                                 breaks = c(-Inf, -1, 1, Inf),
                                 break_labels = c("Not at all concerned",
                                                  "Somewhat concerned",
                                                  "Very concerned")))
#>   ID        x        ordered
#> 1  1 1.755604 Very concerned
#> 2  2 5.182757 Very concerned
#> 3  3 1.420082 Very concerned
# Likert data: often used for survey data
fabricate(N = 10,
          support_free_college = draw_likert(x = rnorm(N),
                                             type = 5))
#>    ID support_free_college
#> 1  01 Don't Know / Neutral
#> 2  02                Agree
#> 3  03       Strongly Agree
#> 4  04                Agree
#> 5  05                Agree
#> 6  06    Strongly Disagree
#> 7  07       Strongly Agree
#> 8  08             Disagree
#> 9  09                Agree
#> 10 10 Don't Know / Neutral
# Count data: useful for rates of occurrences over time.
fabricate(N = 5,
          x = c(0, 5, 25, 50, 100),
          theft_rate = draw_count(mean = x))
#>   ID   x theft_rate
#> 1  1   0          0
#> 2  2   5          9
#> 3  3  25         25
#> 4  4  50         51
#> 5  5 100        102
# Categorical data: useful for demographic data.
fabricate(N = 6, p1 = runif(N), p2 = runif(N), p3 = runif(N),
          cat = draw_categorical(cbind(p1, p2, p3)))
#>   ID        p1        p2         p3 cat
#> 1  1 0.1604869 0.5617880 0.03923126   2
#> 2  2 0.2274149 0.6039721 0.62940734   3
#> 3  3 0.4889994 0.1923918 0.52326970   3
#> 4  4 0.2662559 0.3716696 0.39310540   1
#> 5  5 0.9868869 0.1200995 0.87789330   3
#> 6  6 0.6590275 0.3957093 0.28824992   2
```