Multi-Arm Experiment

Learning about effects of causes takes time, money, and effort. Multi-arm designs can make this process somewhat more efficient: multiple treatments are assigned in parallel and compared against the same control group, which reduces sample size and hence costs.

Suppose that we want to learn whether there is racial discrimination in the labor market by conducting an experiment. Companies from our population of size \(30\) are randomly assigned to receive a résumé from a white, black, or Latino candidate. Companies are assigned to one of these three conditions with equal probabilities, and résumés only vary race and are otherwise identical. We define our outcome of interest as the difference in callbacks between experimental conditions.

In settings of multiple treatment arms, we could do a number of pairwise comparisons: across treatments and each treatment against control.

Design Declaration

  • Model:

    We specify a population of size \(N\) where a unit \(i\) has a potential outcome, \(Y_i(Z = 0)\), when it remains untreated and \(M\) \((m = 1, 2, ..., M)\) potential outcomes defined according to the treatment that it receives. The effect of each treatment on the outcome of unit \(i\) is equal to the difference in the potential outcome under treatment condition \(m\) and the control condition: \(Y_i(Z = m) -Y_i(Z = 0)\).

  • Inquiry:

    We are interested in all of the pairwise comparisons between arms: \(E[Y(m) - Y(m')]\), for all \(m \in \{1,...,M\}\).

  • Data strategy:

    We randomly assign \(k/N\) units to each of the treatment arms.

  • Answer strategy:

    We take every pairwise difference in means corresponding to the specific estimand.

N <- 30
outcome_means <- c(0.5, 1, 2, 0.5)
sd_i <- 1
outcome_sds <- c(0, 0, 0, 0)

population <- declare_population(N = N, u_1 = rnorm(N, 0, outcome_sds[1L]), 
    u_2 = rnorm(N, 0, outcome_sds[2L]), u_3 = rnorm(N, 0, outcome_sds[3L]), 
    u_4 = rnorm(N, 0, outcome_sds[4L]), u = rnorm(N) * sd_i)
potential_outcomes <- declare_potential_outcomes(formula = Y ~ (outcome_means[1] + 
    u_1) * (Z == "1") + (outcome_means[2] + u_2) * (Z == "2") + 
    (outcome_means[3] + u_3) * (Z == "3") + (outcome_means[4] + 
    u_4) * (Z == "4") + u, conditions = c("1", "2", "3", "4"), 
    assignment_variables = Z)
estimand <- declare_estimands(ate_Y_2_1 = mean(Y_Z_2 - Y_Z_1), ate_Y_3_1 = mean(Y_Z_3 - 
    Y_Z_1), ate_Y_4_1 = mean(Y_Z_4 - Y_Z_1), ate_Y_3_2 = mean(Y_Z_3 - 
    Y_Z_2), ate_Y_4_2 = mean(Y_Z_4 - Y_Z_2), ate_Y_4_3 = mean(Y_Z_4 - 
    Y_Z_3))
assignment <- declare_assignment(num_arms = 4, conditions = c("1", "2", "3", 
"4"), assignment_variable = Z)
reveal_Y <- declare_reveal(assignment_variables = Z)
estimator <- declare_estimator(handler = function(data) {
    estimates <- rbind.data.frame(ate_Y_2_1 = difference_in_means(formula = Y ~ 
        Z, data = data, condition1 = "1", condition2 = "2"), 
        ate_Y_3_1 = difference_in_means(formula = Y ~ Z, data = data, 
            condition1 = "1", condition2 = "3"), ate_Y_4_1 = difference_in_means(formula = Y ~ 
            Z, data = data, condition1 = "1", condition2 = "4"), 
        ate_Y_3_2 = difference_in_means(formula = Y ~ Z, data = data, 
            condition1 = "2", condition2 = "3"), ate_Y_4_2 = difference_in_means(formula = Y ~ 
            Z, data = data, condition1 = "2", condition2 = "4"), 
        ate_Y_4_3 = difference_in_means(formula = Y ~ Z, data = data, 
            condition1 = "3", condition2 = "4"))
    names(estimates)[names(estimates) == "N"] <- "N_DIM"
    estimates$estimator_label <- c("DIM (Z_2 - Z_1)", "DIM (Z_3 - Z_1)", 
    "DIM (Z_4 - Z_1)", "DIM (Z_3 - Z_2)", "DIM (Z_4 - Z_2)", 
    "DIM (Z_4 - Z_3)")
    estimates$estimand_label <- rownames(estimates)
    estimates$estimate <- estimates$coefficients
    estimates$term <- NULL
    return(estimates)
})
multi_arm_design <- population + potential_outcomes + assignment + 
    reveal_Y + estimand + estimator

Takeaways

diagnosis <- diagnose_design(multi_arm_design)
Bias Power Coverage Mean Estimate Mean Estimand
-0.01 0.14 0.95 0.49 0.50
(0.02) (0.02) (0.01) (0.02) (0.00)
0.01 0.76 0.97 1.51 1.50
(0.02) (0.02) (0.01) (0.02) (0.00)
0.02 0.41 0.94 1.02 1.00
(0.02) (0.02) (0.01) (0.02) (0.00)
0.04 0.05 0.95 0.04 0.00
(0.02) (0.01) (0.01) (0.02) (0.00)
0.05 0.12 0.95 -0.45 -0.50
(0.02) (0.02) (0.01) (0.02) (0.00)
0.03 0.74 0.96 -1.47 -1.50
(0.03) (0.02) (0.01) (0.03) (0.00)
  • The pairwise differences in means provide unbiased estimates of the average treatment effect (ATE) in each arm.

  • These estimates, however, are not so precise. The estimated standard deviation is large yielding wide confidence intervals that contain the true value of the ATEs more than 95% of the time.