Often we want to learn about the causal effect of one thing on something else – say, the effect of taking a pill on a person becoming healthy. We can define the effect of a cause on an outcome as the difference between the outcomes that would have occurred under two states of the world: the potential outcome in a world where the cause is present and the potential outcome in a world where it is absent. Were we able to observe that a person was sick in a world where they took the pill and sick in a world where they didn’t, for example, we would infer that the pill did not have any causal effect on their health.
The problem is we can never observe our world in two counterfactual states at once, and so we can never estimate the effect of a cause on any particular individual person or thing. We can, however, estimate an average of individual causal effects, which we refer to as the average treatment effect (ATE). The simplest design for inferring an ATE is the two-arm experiment: some people are assigned at random to a treatment (the first arm), and the rest are assigned to the control (the second arm). The difference in the averages of the two groups gives us an unbiased estimate of the true ATE.
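To make the missing-data nature of this problem concrete, here is a minimal base R sketch. The population size, the constant effect of 1, and all variable names are illustrative assumptions, not part of the design declared later.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

N <- 8            # a tiny illustrative population (assumed size)
Y0 <- rnorm(N)    # outcome each unit would have without the pill
Y1 <- Y0 + 1      # outcome with the pill; a constant effect of 1 is assumed

# Complete random assignment: exactly half of the units are treated
Z <- sample(rep(c(0, 1), each = N / 2))

# Only one potential outcome per unit is ever revealed
Y_obs <- ifelse(Z == 1, Y1, Y0)

observed <- data.frame(
  Z = Z,
  Y = Y_obs,
  Y0_seen = ifelse(Z == 0, Y0, NA),  # missing for treated units
  Y1_seen = ifelse(Z == 1, Y1, NA)   # missing for control units
)
print(observed)

# Difference in means: our estimate of the ATE from the observed data
ate_hat <- mean(Y_obs[Z == 1]) - mean(Y_obs[Z == 0])
ate_hat
```

Every row has exactly one missing potential outcome, so no individual effect can be computed, yet the difference in group means is still available.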
How does this design work? One way to think about it is to compare the random assignment of the treatment to a random sampling procedure: in this design it is as though we take a representative sample from the two counterfactual states of the world. Seen in this way, the treatment and control groups are ‘random samples’ from the potential outcomes in a world where the cause is present and one in which it is absent. By taking the mean of the group that represents the treated potential outcomes and comparing it to the mean of the group that represents the untreated potential outcomes, we can construct a representative guess of the true average difference in the two states of the world. Just like in random sampling, our guess of the ATE is correct on average across all the random assignments we might have drawn, which is what it means for the design to be unbiased, and as the size of our experiment grows our guess converges to the true ATE. As we shall see, however, characterizing our uncertainty about our guesses can be complicated, even in such a simple design.
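The sampling analogy can be checked with a small simulation. The sketch below is an illustration in base R under assumed values (effects drawn with mean 1 and standard deviation 1, and the helper name simulate_estimates is hypothetical). It holds a population fixed, redraws the random assignment many times, and summarizes the resulting estimates for three experiment sizes.

```r
set.seed(2)

# Redraw the random assignment many times for one fixed population and
# summarize the resulting difference-in-means estimates.
simulate_estimates <- function(N, n_sims = 2000) {
  Y0 <- rnorm(N)                          # control potential outcomes
  Y1 <- Y0 + rnorm(N, mean = 1, sd = 1)   # treated potential outcomes (assumed effects)
  true_ate <- mean(Y1 - Y0)
  estimates <- replicate(n_sims, {
    Z <- sample(rep(c(0, 1), each = N / 2))   # half treated, half control
    mean(Y1[Z == 1]) - mean(Y0[Z == 0])
  })
  c(N = N, true_ate = true_ate,
    mean_estimate = mean(estimates),
    sd_estimate = sd(estimates))
}

results <- t(sapply(c(20, 200, 2000), simulate_estimates))
round(results, 3)
```

In a run of this sketch we would expect the mean estimate to sit close to the true ATE at every size, while the spread of the estimates shrinks as the experiment grows.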
Our model of the world specifies a population of \(N\) units whose control potential outcome, \(Y_i(Z = 0)\), follows a standard normal distribution. A unit’s individual treatment effect, \(t_i\), is a random draw from a distribution with mean \(\tau\) and standard deviation \(\sigma\), and is added to its control potential outcome: \(Y_i(Z = 1) = Y_i(Z = 0) + t_i\). Provided the treatment effects are not negatively correlated with the control potential outcomes, this implies that the variance of the units’ treated potential outcomes is higher than the variance of their control potential outcomes. The two potential outcomes are nonetheless correlated, because the treated potential outcome is created by simply adding the treatment effect to the control potential outcome.
We want to know the average of all units’ differences in treated and untreated potential outcomes – the average treatment effect: \(E[Y_i(Z = 1) - Y_i(Z = 0)] = E[t_i] = \tau\).
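The model and inquiry described above can be sketched in base R as follows. The values of \(\tau\) and \(\sigma\), the normal distribution for the effects, and the assumption that effects are drawn independently of the control potential outcomes are ours, chosen only for illustration.

```r
set.seed(3)

N <- 10000     # large population so sample moments sit near their expectations
tau <- 1       # assumed mean treatment effect
sigma <- 0.5   # assumed standard deviation of treatment effects

Y0 <- rnorm(N)                            # Y_i(Z = 0): standard normal
t_i <- rnorm(N, mean = tau, sd = sigma)   # effects, drawn independently of Y0 (assumption)
Y1 <- Y0 + t_i                            # Y_i(Z = 1)

var(Y0)       # close to 1
var(Y1)       # close to 1 + sigma^2 under the independence assumption
cor(Y0, Y1)   # close to 1 / sqrt(1 + sigma^2) under the same assumption

# The inquiry: the average of the individual treatment effects
ate <- mean(Y1 - Y0)
ate
```

Because the effects are added to the control potential outcomes, the two potential outcomes stay positively correlated even though the treated ones are more variable.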
We randomly sample \(n\) units from the population of \(N\). We randomly assign a fixed number, \(m\), to treatment, and the rest of the \(n-m\) units to control.
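As a rough illustration of this data strategy, the base R sketch below samples \(n\) units from \(N\) and then assigns exactly \(m\) of them to treatment. The particular values of N, n, and m and the data frame layout are assumptions made for the example.

```r
set.seed(4)

N <- 1000   # population size (assumed)
n <- 100    # sample size (assumed)
m <- 50     # number of sampled units assigned to treatment (assumed)

population <- data.frame(id = seq_len(N))

# Sampling: draw n of the N units without replacement
sampled <- population[sample(N, n), , drop = FALSE]

# Assignment: exactly m treated and n - m control, in random order
sampled$Z <- sample(c(rep(1, m), rep(0, n - m)))

table(sampled$Z)
```

Fixing the number treated (complete random assignment) guarantees that both arms have the intended size in every realization, unlike independent coin flips for each unit.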
We subtract the mean of the control group from the mean of the treatment group in order to estimate the average treatment effect.
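The answer strategy amounts to a single subtraction. The base R sketch below uses simulated outcomes with an assumed effect of 1, and also shows that the same number is returned as the slope from a regression of the outcome on the treatment indicator.

```r
set.seed(5)

n <- 100
Z <- sample(rep(c(0, 1), each = n / 2))   # complete random assignment
Y <- rnorm(n) + 1 * Z                     # observed outcomes; effect of 1 is assumed

# Difference in means: treatment group mean minus control group mean
dim_estimate <- mean(Y[Z == 1]) - mean(Y[Z == 0])

# The same number as the slope from regressing Y on Z
ols_estimate <- unname(coef(lm(Y ~ Z))["Z"])

c(difference_in_means = dim_estimate, ols = ols_estimate)
```

The equivalence holds because, with a single binary regressor, the least-squares slope is exactly the difference between the two group means.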
library(DeclareDesign)

# Design parameters
N <- 100
assignment_prob <- 0.5
control_mean <- 0
control_sd <- 1
treatment_mean <- 1
treatment_sd <- 1
rho <- 1  # correlation between u_0 and u_1

# Model: standardized potential outcomes u_0 and u_1 with correlation rho
population <- declare_population(
  N = N,
  u_0 = rnorm(N),
  u_1 = rnorm(n = N, mean = rho * u_0, sd = sqrt(1 - rho^2))
)
potential_outcomes <- declare_potential_outcomes(
  Y ~ (1 - Z) * (u_0 * control_sd + control_mean) +
    Z * (u_1 * treatment_sd + treatment_mean)
)

# Inquiry: the average treatment effect
estimand <- declare_estimand(ATE = mean(Y_Z_1 - Y_Z_0))

# Data strategy: random assignment, then reveal observed outcomes
assignment <- declare_assignment(prob = assignment_prob)
reveal_Y <- declare_reveal()

# Answer strategy: difference in means via Y ~ Z
estimator <- declare_estimator(Y ~ Z, estimand = estimand)

two_arm_design <- population + potential_outcomes + estimand +
  assignment + reveal_Y + estimator
diagnosis <- diagnose_design(two_arm_design)
| Term | Bias | RMSE | Power | Coverage | Mean Estimate | SD Estimate | Mean SE | Type S Rate | Mean Estimand |
- As the diagnosis reveals, the estimate of the ATE is unbiased, but our coverage is above what it should be (95%). This reflects the fact that conventional estimators of the standard error for the difference in means make conservative assumptions about the true covariance in potential outcomes, and thus tend to produce confidence intervals that are too wide. In other words, while the estimate of the difference in outcomes is unbiased, our estimate of the variation in this difference is biased upwards.
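The conservativeness of the conventional standard error can be illustrated with a base R simulation. This is a sketch under assumptions of our own: a fixed population of 100 units, strongly heterogeneous effects (standard deviation 2), and normal-approximation 95% intervals. The conventional variance estimator sums the two within-group variance terms and cannot subtract the term involving the variance of individual effects, which is never observed, so it overstates the true variance whenever effects differ across units.

```r
set.seed(6)

N <- 100   # number of units (assumed)
m <- 50    # number treated (assumed)
Y0 <- rnorm(N)
Y1 <- Y0 + rnorm(N, mean = 1, sd = 2)   # strongly heterogeneous effects (assumed)
true_ate <- mean(Y1 - Y0)

one_draw <- function() {
  Z <- sample(rep(c(0, 1), times = c(N - m, m)))
  y1 <- Y1[Z == 1]
  y0 <- Y0[Z == 0]
  est <- mean(y1) - mean(y0)
  se <- sqrt(var(y1) / m + var(y0) / (N - m))   # conventional standard error
  covered <- abs(est - true_ate) <= qnorm(0.975) * se
  c(est = est, se = se, covered = covered)
}

sims <- t(replicate(5000, one_draw()))

sd_estimate <- sd(sims[, "est"])      # actual variability of the estimates
mean_se <- mean(sims[, "se"])         # what the estimator reports on average
coverage <- mean(sims[, "covered"])   # share of intervals containing the true ATE

c(sd_estimate = sd_estimate, mean_se = mean_se, coverage = coverage)
```

In this setup we would expect the average standard error to exceed the actual standard deviation of the estimates, and the coverage to exceed 95%. When effects are identical across units the omitted term is zero and the gap disappears.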
Using the Simple Two Arm Designer
In R, you can generate a simple two-arm design using the template function two_arm_designer() in the DesignLibrary package by running the following line, which loads the package:
library(DesignLibrary)
We can then create specific designs by defining values for each argument. For example, we can create a design called two_arm_design with N, prob, and ate set to 500, .5, and .4, respectively, while the other parameters keep their default values. To do so, we run the line below.
two_arm_design <- two_arm_designer(N = 500, prob = .5, ate = .4)
You can see more details on the two_arm_designer() function, its arguments, and default values, by running the following line of code:
?two_arm_designer