Potential Outcomes

    declare_potential_outcomes(..., handler = potential_outcomes_handler, label = NULL)

    potential_outcomes.formula(formula, data, conditions = c(0, 1),
      assignment_variable = "Z", level = NULL)

    potential_outcomes.default(formula = stop("Not provided"), ..., data, level = NULL)

Argument | Description |
---|---|
... | Arguments to the potential_outcomes_function |
handler | A function that accepts a data.frame as an argument and returns a data.frame with potential outcomes columns appended. See the examples for the behavior of the default function. |
label | A step label |
formula | A formula to calculate potential outcomes as functions of assignment variables |
data | A data.frame |
conditions | Vector specifying the values the assignment variable can realize |
assignment_variable | The name of the assignment variable |
level | A character string specifying the level of hierarchy at which fabricate should calculate |

Returns a function that takes a data.frame and returns a data.frame.

A `declare_potential_outcomes` declaration returns a function. That function takes data and returns data with potential outcomes columns appended. These columns describe the outcomes that each unit would express if that unit were in the corresponding treatment condition.
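The "declaration returns a function" pattern can be sketched in base R. Everything named here (`my_po_handler`, `declare_step`, the `baseline` and `effect` arguments) is hypothetical and only illustrates the shape of the contract described above; it is not the package's implementation. A function with the same shape as `my_po_handler` could presumably be supplied through the `handler` argument, but check the package documentation before relying on that.

```r
# Hypothetical custom handler: accepts a data.frame and returns it with
# potential outcomes columns appended (named Y_Z_<condition>).
my_po_handler <- function(data, baseline = 0.05, effect = 0.25) {
  data$Y_Z_0 <- baseline            # outcome if the unit were untreated
  data$Y_Z_1 <- baseline + effect   # outcome if the unit were treated
  data
}

# Hypothetical stand-in for a declaration: it captures a handler and its
# arguments, and returns a function that only needs the data.
declare_step <- function(handler, ...) {
  args <- list(...)
  function(data) do.call(handler, c(list(data), args))
}

step <- declare_step(my_po_handler, effect = 0.5)
pop <- data.frame(ID = 1:3, age = c(30, 85, 65))
out <- step(pop)
print(out)
```

Separating the declaration from its execution means the same step can be applied to any number of freshly drawn populations.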

The potential outcomes function can sometimes be a stumbling block for users, as some are uncomfortable asserting anything in particular about the very causal process that they are conducting a study to learn about! We recommend trying to imagine what your preferred theory would predict, what an alternative theory would predict, and what your study would reveal if there were no differences in potential outcomes for any unit (i.e., all treatment effects are zero).
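As a rough illustration of that advice, the base R sketch below writes down three sets of potential outcomes and compares the average treatment effect each implies. The effect sizes and the names `preferred`, `alternative`, `null_model`, and `ate` are assumptions chosen for illustration; they do not come from the package.

```r
set.seed(42)
pop <- data.frame(ID = 1:1000, age = sample(18:95, 1000, replace = TRUE))

# Preferred theory (assumed): the treatment effect grows with age.
preferred <- transform(pop, Y_Z_0 = 0, Y_Z_1 = 0.25 + 0.01 * age)

# Alternative theory (assumed): the same effect for every unit.
alternative <- transform(pop, Y_Z_0 = 0, Y_Z_1 = 0.25)

# Null model: no difference in potential outcomes for any unit.
null_model <- transform(pop, Y_Z_0 = 0, Y_Z_1 = 0)

# Average treatment effect implied by a set of potential outcomes.
ate <- function(d) mean(d$Y_Z_1 - d$Y_Z_0)

effects <- sapply(
  list(preferred = preferred, alternative = alternative, null = null_model),
  ate
)
print(effects)
```

Running a design against each set shows what the study would report if each account of the world were true.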

    my_population <- declare_population(
      N = 1000,
      income = rnorm(N),
      age = sample(18:95, N, replace = TRUE))

    pop <- my_population()

    # By default, there are two ways of declaring potential outcomes:
    # as separate variables or using a formula:

    # As separate variables
    my_potential_outcomes <- declare_potential_outcomes(
      Y_Z_0 = .05,
      Y_Z_1 = .30 + .01 * age)

    head(my_potential_outcomes(pop))
    #>     ID     income age Y_Z_0 Y_Z_1
    #> 1 0001 -0.4349032  30  0.05  0.60
    #> 2 0002  0.4989009  85  0.05  1.15
    #> 3 0003  1.1979224  65  0.05  0.95
    #> 4 0004 -0.1707637  71  0.05  1.01
    #> 5 0005  0.5745204  40  0.05  0.70
    #> 6 0006 -0.5342050  44  0.05  0.74

    # Using a formula
    my_potential_outcomes <- declare_potential_outcomes(
      formula = Y ~ .25 * Z + .01 * age * Z)

    pop_pos <- my_potential_outcomes(pop)
    head(pop_pos)
    #>     ID     income age Y_Z_0 Y_Z_1
    #> 1 0001 -0.4349032  30     0  0.55
    #> 2 0002  0.4989009  85     0  1.10
    #> 3 0003  1.1979224  65     0  0.90
    #> 4 0004 -0.1707637  71     0  0.96
    #> 5 0005  0.5745204  40     0  0.65
    #> 6 0006 -0.5342050  44     0  0.69

    # conditions defines the "range" of the potential outcomes function
    my_potential_outcomes <- declare_potential_outcomes(
      formula = Y ~ .25 * Z + .01 * age * Z,
      conditions = 1:4)

    head(my_potential_outcomes(pop))
    #>     ID     income age Y_Z_1 Y_Z_2 Y_Z_3 Y_Z_4
    #> 1 0001 -0.4349032  30  0.55  1.10  1.65  2.20
    #> 2 0002  0.4989009  85  1.10  2.20  3.30  4.40
    #> 3 0003  1.1979224  65  0.90  1.80  2.70  3.60
    #> 4 0004 -0.1707637  71  0.96  1.92  2.88  3.84
    #> 5 0005  0.5745204  40  0.65  1.30  1.95  2.60
    #> 6 0006 -0.5342050  44  0.69  1.38  2.07  2.76
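One way to read the formula examples: the right-hand side is evaluated once per value in `conditions`, with the assignment variable fixed at that value, and each result is stored in a column named after the outcome, the assignment variable, and the condition. The base R function below, `expand_potential_outcomes`, is a hypothetical re-creation of that behavior for intuition only, assuming a single assignment variable; it is not the package's code.

```r
# Hypothetical helper: evaluate the formula's right-hand side once per
# condition and append one column per condition.
expand_potential_outcomes <- function(data, formula, conditions = c(0, 1),
                                      assignment_variable = "Z") {
  outcome <- as.character(formula[[2]])  # left-hand side, e.g. "Y"
  rhs <- formula[[3]]                    # right-hand side expression
  for (cond in conditions) {
    scratch <- data
    scratch[[assignment_variable]] <- cond  # fix the assignment at this value
    column <- paste(outcome, assignment_variable, cond, sep = "_")
    data[[column]] <- eval(rhs, scratch, environment(formula))
  }
  data
}

pop_small <- data.frame(ID = c("0001", "0002"), age = c(30, 85))
po <- expand_potential_outcomes(
  pop_small,
  Y ~ .25 * Z + .01 * age * Z,
  conditions = 1:4
)
print(po)
```

For the two ages used here (30 and 85), the values agree with the first two rows of the `conditions = 1:4` output above.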