A \(2^k\) factorial design is an extension of the `two_by_two_designer` that allows for \(k\) factors (or treatments), each with two levels, resulting in \(2^k\) combinations of treatment conditions. You might be interested in testing not just how two, but how three, or any number \(k\), of treatment conditions interact. The interest in the effect of each treatment in the presence or absence of the others is a key distinction from the `multi_arm_designer`, in which we evaluate the effects of treatments whose assignments do not overlap.
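To see how the \(2^k\) conditions arise, here is a minimal base-R sketch for \(k = 3\); the `T1`, `T2`, `T3` names are illustrative stand-ins for the three treatment indicators:

```r
# Enumerate all 2^3 = 8 treatment combinations of k = 3 binary factors
k <- 3
conditions <- expand.grid(T1 = 0:1, T2 = 0:1, T3 = 0:1)
nrow(conditions)  # 2^3 = 8 rows, one per treatment condition
```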

Let’s say you set out to study how incentives and managerial practices affect employee productivity *in combination*. Let’s imagine that intervention \(A\) is a salary increase conditional on performance indicators, intervention \(B\) is increased monitoring, and intervention \(C\) is a motivational activity. A key question might be the marginal effect of conditional salary increases on productivity in the presence of greater monitoring or in the presence of motivational interventions, or both.

## Design Declaration

**M**odel: We stipulate \(2^k\) potential outcomes. In our example, with three treatments \(A\), \(B\), and \(C\), each taking values in \(\{0,1\}\), we write the potential outcomes as \(Y^{ABC}_i\). For example, the potential outcome of a unit in control for all treatments is \(Y^{000}_i\), and that of a unit assigned to \(B\) and \(C\) only is \(Y^{011}_i\). Each potential outcome is distributed normally with mean \(\mu_j\) and standard deviation \(\sigma_j\), where \(j \in \{1,\ldots,2^k\}\).

**I**nquiry: Estimands are defined for each combination of treatment assignment as weighted averages of differences in potential outcomes. The effect of \(A\) in this design, averaged over the conditions defined by \(B\) and \(C\), is given by: \(\tau_A = \frac{1}{4}\left[(Y^{111} - Y^{011}) + (Y^{101} - Y^{001}) + (Y^{110} - Y^{010}) + (Y^{100} - Y^{000})\right]\).
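As a check on the definition, \(\tau_A\) can be computed directly from the eight mean potential outcomes. The specific means below are made up purely for illustration:

```r
# Hypothetical mean potential outcomes, indexed mu[["ABC"]] for treatments A, B, C
mu <- c("000" = 1.0, "001" = 1.2, "010" = 0.8, "011" = 1.1,
        "100" = 1.5, "101" = 1.9, "110" = 1.4, "111" = 2.0)

# tau_A: the effect of A, averaged over the four (B, C) conditions
tau_A <- 1/4 * ((mu[["111"]] - mu[["011"]]) + (mu[["101"]] - mu[["001"]]) +
                (mu[["110"]] - mu[["010"]]) + (mu[["100"]] - mu[["000"]]))
tau_A  # 0.675
```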

**D**ata strategy: We randomly assign units to one of the \(2^k\) combinations of treatment conditions with equal probability.
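A minimal base-R sketch of this step, with `sample` standing in for the randomization machinery that `declare_assignment` uses in the declaration below:

```r
set.seed(1)
N <- 256

# Complete random assignment: exactly N/8 units in each of the 8 conditions
Z <- sample(rep(1:8, each = N / 8))
table(Z)  # 32 units per condition
```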

**A**nswer strategy: Estimates are drawn from a regression in which all treatment indicators are demeaned and observations are weighted by the inverse probability of assignment to their condition. In this demeaned regression, the constant captures the average outcome across all conditions, not the outcome when all units are in the control condition.
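The claim about the constant can be verified in a small simulation: in a balanced factorial with demeaned treatment indicators, the intercept of the fully interacted regression equals the average outcome across all conditions. This sketch uses simulated data and base-R `lm` standing in for `lm_robust`:

```r
set.seed(2)
N <- 256

# Balanced 2^3 factorial: 32 units in each of the 8 cells
d <- expand.grid(T1 = 0:1, T2 = 0:1, T3 = 0:1)[rep(1:8, each = N / 8), ]
d$Y <- rnorm(N, mean = 1 + 0.5 * d$T1)

# Demean the treatment indicators so the intercept is the grand mean
d[, c("T1", "T2", "T3")] <- d[, c("T1", "T2", "T3")] - 0.5
fit <- lm(Y ~ T1 * T2 * T3, data = d)
all.equal(unname(coef(fit)["(Intercept)"]), mean(d$Y))  # TRUE
```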

```
N <- 256

# Means and standard deviations of the 2^3 = 8 potential outcomes
outcome_means <- c(0, 0, 0, 0, 0, 0, 0, 0)
outcome_sds <- c(1, 1, 1, 1, 1, 1, 1, 1)

population <- declare_population(N)

# One potential outcome per combination of the three binary treatments,
# each drawn from a normal distribution with its own mean and SD
potential_outcomes <- declare_potential_outcomes(
  Y_0_0_0 = outcome_means[1] + rnorm(N, 0, outcome_sds[1]),
  Y_1_0_0 = outcome_means[2] + rnorm(N, 0, outcome_sds[2]),
  Y_0_1_0 = outcome_means[3] + rnorm(N, 0, outcome_sds[3]),
  Y_1_1_0 = outcome_means[4] + rnorm(N, 0, outcome_sds[4]),
  Y_0_0_1 = outcome_means[5] + rnorm(N, 0, outcome_sds[5]),
  Y_1_0_1 = outcome_means[6] + rnorm(N, 0, outcome_sds[6]),
  Y_0_1_1 = outcome_means[7] + rnorm(N, 0, outcome_sds[7]),
  Y_1_1_1 = outcome_means[8] + rnorm(N, 0, outcome_sds[8])
)

# Reveal Y by looking up, for each unit, the potential-outcome column
# that matches its assigned (T1, T2, T3) combination
reveal_Y <- declare_reveal(handler = function(data) {
  potential_cols <- mapply(paste, data[, c("T1", "T2", "T3"), drop = FALSE],
                           sep = "_", SIMPLIFY = FALSE)
  potential_cols <- do.call(paste, c("Y", potential_cols, sep = "_"))
  upoc <- unique(potential_cols)
  df <- data[, upoc, drop = FALSE]
  R <- seq_len(nrow(df))
  C <- match(potential_cols, colnames(df))
  data[, "Y"] <- df[cbind(R, C)]
  data
})

# Estimands: the overall average plus every main effect and interaction,
# each defined as a weighted average of potential outcomes
estimand <- declare_estimand(
  Overall_average = mean((Y_0_0_0 + Y_0_0_1 + Y_0_1_0 + Y_0_1_1 +
                            Y_1_0_0 + Y_1_0_1 + Y_1_1_0 + Y_1_1_1) / 8),
  te_T1 = mean((-Y_0_0_0 - Y_0_0_1 - Y_0_1_0 - Y_0_1_1 +
                  Y_1_0_0 + Y_1_0_1 + Y_1_1_0 + Y_1_1_1) / 4),
  te_T2 = mean((-Y_0_0_0 - Y_0_0_1 + Y_0_1_0 + Y_0_1_1 -
                  Y_1_0_0 - Y_1_0_1 + Y_1_1_0 + Y_1_1_1) / 4),
  te_T3 = mean((-Y_0_0_0 + Y_0_0_1 - Y_0_1_0 + Y_0_1_1 -
                  Y_1_0_0 + Y_1_0_1 - Y_1_1_0 + Y_1_1_1) / 4),
  `te_T1:T2` = mean((Y_0_0_0 + Y_0_0_1 - Y_0_1_0 - Y_0_1_1 -
                       Y_1_0_0 - Y_1_0_1 + Y_1_1_0 + Y_1_1_1) / 2),
  `te_T1:T3` = mean((Y_0_0_0 - Y_0_0_1 + Y_0_1_0 - Y_0_1_1 -
                       Y_1_0_0 + Y_1_0_1 - Y_1_1_0 + Y_1_1_1) / 2),
  `te_T2:T3` = mean((Y_0_0_0 - Y_0_0_1 - Y_0_1_0 + Y_0_1_1 +
                       Y_1_0_0 - Y_1_0_1 - Y_1_1_0 + Y_1_1_1) / 2),
  `te_T1:T2:T3` = mean(-Y_0_0_0 + Y_0_0_1 + Y_0_1_0 - Y_0_1_1 +
                         Y_1_0_0 - Y_1_0_1 - Y_1_1_0 + Y_1_1_1),
  label = "ATE"
)

# Assign each unit to one of the 8 conditions with equal probability
assignment_factors <- declare_assignment(
  conditions = 1:(2^3),
  prob_each = rep(1/8, 8)
)

# Translate the condition number Z into the three binary treatment indicators
assignment <- declare_step(
  fabricate,
  T1 = as.numeric(Z %in% c(2L, 4L, 6L, 8L)),
  T2 = as.numeric(Z %in% c(3L, 4L, 7L, 8L)),
  T3 = as.numeric(Z %in% 5:8)
)

# Demean the treatment indicators and weight by the inverse probability of
# assignment; the intercept then estimates the overall average outcome
estimator <- declare_estimator(handler = tidy_estimator(function(data) {
  treatment_cols <- names(data) %in% c("T1", "T2", "T3")
  data[, treatment_cols] <- data[, treatment_cols] - 0.5
  mod <- lm_robust(Y ~ T1 * T2 * T3, data = data, weights = 1 / Z_cond_prob)
  estimate_df <- tidy(mod)
  estimate_df$estimand_label <- paste0("te_", estimate_df$term)
  estimate_df$estimand_label[estimate_df$estimand_label == "te_(Intercept)"] <-
    "Overall_average"
  estimate_df
}))

factorial_design <- population + potential_outcomes + assignment_factors +
  assignment + reveal_Y + estimand + estimator
```

## Takeaways

`diagnosis <- diagnose_design(factorial_design)`

| Design Label | Estimand Label | Estimator Label | Term | N Sims | Bias | RMSE | Power | Coverage | Mean Estimate | SD Estimate | Mean Se | Type S Rate | Mean Estimand |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| factorial_design | Overall_average | estimator | (Intercept) | 500 | 0.00 | 0.06 | 0.06 | 0.95 | 0.00 | 0.06 | 0.06 | 0.14 | -0.00 |
| | | | | | (0.00) | (0.00) | (0.01) | (0.01) | (0.00) | (0.00) | (0.00) | (0.07) | (0.00) |
| factorial_design | te_T1 | estimator | T1 | 500 | 0.00 | 0.11 | 0.04 | 0.97 | 0.01 | 0.12 | 0.13 | 0.14 | 0.00 |
| | | | | | (0.01) | (0.00) | (0.01) | (0.01) | (0.01) | (0.00) | (0.00) | (0.09) | (0.00) |
| factorial_design | te_T1:T2 | estimator | T1:T2 | 500 | -0.00 | 0.24 | 0.05 | 0.97 | -0.00 | 0.25 | 0.25 | 0.15 | 0.00 |
| | | | | | (0.01) | (0.01) | (0.01) | (0.01) | (0.01) | (0.01) | (0.00) | (0.07) | (0.00) |
| factorial_design | te_T1:T2:T3 | estimator | T1:T2:T3 | 500 | 0.01 | 0.46 | 0.05 | 0.97 | 0.00 | 0.48 | 0.50 | 0.09 | -0.01 |
| | | | | | (0.02) | (0.01) | (0.01) | (0.01) | (0.02) | (0.01) | (0.00) | (0.06) | (0.01) |
| factorial_design | te_T1:T3 | estimator | T1:T3 | 500 | 0.00 | 0.23 | 0.04 | 0.96 | -0.01 | 0.25 | 0.25 | 0.14 | -0.01 |
| | | | | | (0.01) | (0.01) | (0.01) | (0.01) | (0.01) | (0.01) | (0.00) | (0.08) | (0.00) |
| factorial_design | te_T2 | estimator | T2 | 500 | -0.00 | 0.11 | 0.04 | 0.97 | -0.00 | 0.12 | 0.13 | 0.20 | -0.00 |
| | | | | | (0.01) | (0.00) | (0.01) | (0.01) | (0.01) | (0.00) | (0.00) | (0.10) | (0.00) |
| factorial_design | te_T2:T3 | estimator | T2:T3 | 500 | -0.01 | 0.24 | 0.05 | 0.97 | -0.02 | 0.26 | 0.25 | 0.26 | -0.01 |
| | | | | | (0.01) | (0.01) | (0.01) | (0.01) | (0.01) | (0.01) | (0.00) | (0.10) | (0.00) |
| factorial_design | te_T3 | estimator | T3 | 500 | 0.00 | 0.11 | 0.03 | 0.97 | -0.00 | 0.12 | 0.13 | 0.07 | -0.00 |
| | | | | | (0.01) | (0.00) | (0.01) | (0.01) | (0.01) | (0.00) | (0.00) | (0.07) | (0.00) |

(Standard errors of the diagnosands in parentheses.)

- We see that the space of estimands we can estimate is remarkably rich (every main effect and every two- and three-way interaction), but our power to estimate them precisely is very low, and precision worsens as the order of the interaction rises: the mean standard error doubles with each additional interacted treatment, from 0.13 for main effects to 0.25 for two-way and 0.50 for three-way interactions.
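That doubling pattern is a back-of-the-envelope consequence of the contrast weights. For a contrast \(\sum_j w_j \bar{Y}_j\) over the eight cell means, the standard error is \(\sqrt{\sum_j w_j^2 \, \sigma^2 / n_{\text{cell}}}\); under the simulation's assumptions (\(\sigma = 1\), \(256/8 = 32\) units per cell) this reproduces the Mean Se column:

```r
# Standard error of a contrast of 8 cell means with equal cell sizes:
# sqrt(sum(w^2) * sigma^2 / n_cell), here with sigma = 1 and n_cell = 32
n_cell <- 32
se <- function(w) sqrt(sum(w^2) / n_cell)

se(rep(1/8, 8))                     # overall average:     0.0625
se(rep(c(-1, 1) / 4, 4))            # main effect:         0.125
se(rep(c(1, -1) / 2, 4))            # two-way interaction: 0.25
se(c(-1, 1, 1, -1, 1, -1, -1, 1))   # three-way:           0.5
```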