Potential Outcomes

declare_potential_outcomes(...,
  potential_outcomes_function = potential_outcomes_function_default)

Arguments

...  Arguments to the potential_outcomes_function.

potential_outcomes_function  A function that accepts a data.frame as an argument and returns a data.frame with potential outcomes columns appended. See the examples for the behavior of the default function.

Value

A function that takes a data.frame and returns a data.frame with potential outcomes columns appended.

Details

A declare_potential_outcomes declaration returns a function. That function takes data and returns data with potential outcomes columns appended. These columns describe the outcomes that each unit would express if that unit were in the corresponding treatment condition.
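
To make that contract concrete, here is a minimal base-R sketch of a hand-written function of this kind. The name my_po_function and the outcome model are hypothetical and chosen only for illustration; the column names follow the Y_Z_0 / Y_Z_1 pattern shown in the examples. It is not the package's default function.

```r
# Hypothetical hand-written potential outcomes function (illustration only).
# It accepts a data.frame and returns it with two columns appended.
my_po_function <- function(data) {
  data$Y_Z_0 <- 0.05                     # outcome if the unit is untreated
  data$Y_Z_1 <- 0.30 + 0.01 * data$age   # outcome if the unit is treated
  data
}

# Small toy data set (made-up values) to show the input/output shape
toy <- data.frame(ID = 1:3, age = c(39, 92, 21))
toy_pos <- my_po_function(toy)
toy_pos
```

A function of this shape matches the description of the potential_outcomes_function argument; how extra arguments passed through ... reach it may depend on the installed version and should be checked there.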

The potential outcomes function can sometimes be a stumbling block for users, as some are uncomfortable asserting anything in particular about the very causal process that they are conducting a study to learn about! We recommend trying to imagine what your preferred theory would predict, what an alternative theory would predict, and what your study would reveal if there were no differences in potential outcomes for any unit (i.e., all treatment effects are zero).
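
One way to follow this advice is to write down one set of potential outcomes per scenario and compare the implied unit-level effects. The base-R sketch below is only an illustration: the helper add_pos, the toy data, and the effect sizes are all assumptions, not part of the package.

```r
# Made-up toy data
toy <- data.frame(ID = 1:4, age = c(25, 40, 60, 80))

# Hypothetical helper: append potential outcomes given a unit-level effect
add_pos <- function(data, effect) {
  data$Y_Z_0 <- 0.05
  data$Y_Z_1 <- data$Y_Z_0 + effect
  data
}

preferred   <- add_pos(toy, effect = 0.25 + 0.01 * toy$age)  # effect grows with age
alternative <- add_pos(toy, effect = 0.25)                   # same effect for everyone
sharp_null  <- add_pos(toy, effect = 0)                      # no effect for any unit

# Unit-level treatment effects implied by each scenario
preferred$Y_Z_1 - preferred$Y_Z_0
alternative$Y_Z_1 - alternative$Y_Z_0
sharp_null$Y_Z_1 - sharp_null$Y_Z_0
```

Running the same design against each scenario shows what the study would report if each theory were true, including the case where every treatment effect is zero.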

Examples

my_population <-
  declare_population(N = 1000,
                     income = rnorm(N),
                     age = sample(18:95, N, replace = TRUE))
pop <- my_population()

# By default, there are two ways of declaring potential outcomes:
# as separate variables or using a formula:

# As separate variables

my_potential_outcomes <-
  declare_potential_outcomes(
    Y_Z_0 = .05,
    Y_Z_1 = .30 + .01 * age)

head(my_potential_outcomes(pop))
#>     ID     income age Y_Z_0 Y_Z_1
#> 1 0001 -0.1478888  39  0.05  0.69
#> 2 0002 -0.5980440  92  0.05  1.22
#> 3 0003 -0.3238154  95  0.05  1.25
#> 4 0004 -0.1182236  80  0.05  1.10
#> 5 0005 -1.2907442  83  0.05  1.13
#> 6 0006 -0.2049537  21  0.05  0.51

# Using a formula

my_potential_outcomes <- declare_potential_outcomes(
  formula = Y ~ .25 * Z + .01 * age * Z)
pop_pos <- my_potential_outcomes(pop)
head(pop_pos)
#>     ID     income age Y_Z_0 Y_Z_1
#> 1 0001 -0.1478888  39     0  0.64
#> 2 0002 -0.5980440  92     0  1.17
#> 3 0003 -0.3238154  95     0  1.20
#> 4 0004 -0.1182236  80     0  1.05
#> 5 0005 -1.2907442  83     0  1.08
#> 6 0006 -0.2049537  21     0  0.46

# condition_names sets the values of Z for which potential outcomes
# are generated (one Y_Z_<value> column per condition)
my_potential_outcomes <-
  declare_potential_outcomes(
    formula = Y ~ .25 * Z + .01 * age * Z,
    condition_names = 1:4)

head(my_potential_outcomes(pop))
#>     ID     income age Y_Z_1 Y_Z_2 Y_Z_3 Y_Z_4
#> 1 0001 -0.1478888  39  0.64  1.28  1.92  2.56
#> 2 0002 -0.5980440  92  1.17  2.34  3.51  4.68
#> 3 0003 -0.3238154  95  1.20  2.40  3.60  4.80
#> 4 0004 -0.1182236  80  1.05  2.10  3.15  4.20
#> 5 0005 -1.2907442  83  1.08  2.16  3.24  4.32
#> 6 0006 -0.2049537  21  0.46  0.92  1.38  1.84
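
The columns in the last example can be reproduced by hand by evaluating the right-hand side of the formula once for each value of Z. The base-R loop below is a sketch of that arithmetic only, under the assumption that the formula is evaluated separately per condition; it is not the package's implementation, and the toy data frame simply reuses two ages from the output above.

```r
# Two ages taken from the printed output (rows 1 and 2)
toy <- data.frame(age = c(39, 92))

# Evaluate .25 * Z + .01 * age * Z for each condition Z = 1, ..., 4
for (z in 1:4) {
  toy[[paste0("Y_Z_", z)]] <- 0.25 * z + 0.01 * toy$age * z
}

toy
```

The resulting values for age 39 and age 92 agree with the first two rows of the printed output.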