Simple Spillover Design

An advantage of experimental designs is that fewer assumptions are needed to justify causal claims. However some assumptions are still needed. Perhaps the most important of these is the assumption that there are no spillovers (technically this is assumption is part of what is called the Stable Unit Treatment Value Assumption, or “SUTVA”).

The spillover_designer generates a simple design with a data generating process that suffers from spillovers.

The presence of spillovers creates a challenge for the definition of the estimand. Indeed the assumption of no spillovers runs so deep that it is often invoked even prior to the definition of estimands. If you write the “average treatment effect” estimand as \(E(Y(1) - Y(0))\) you are assuming already that the potential outcomes depend only on a unit’s own assignment to treatment.

If there are spillovers however then another estimand is needed. It turns out however that in this case there may be an explosion is the set of possible estimands of interest.

For instance, we can define estimands such as the difference for a unit when it—and only it—is treated, compared to a situation in which no unit is treated. In potential outcomes notation that could be written, for unit 3, say, as:

\[\tau_3 = Y(0,0,1,0,0,\dots) - Y(0,0,0,0,0,\dots)\] One could then define a population estimand that is the average of all these differences over a population. Note that these differences specify different counterfactual assignment vectors for each each individual. In the absence of spillovers this would be the same as the usual average treatment effect; but in the presence of spillovers this estimand is well defined whereas the usual average treatment effect estimand is not.

But there are many other possibilities. For example the difference in outcomes when none are treated and all are treated. Or the difference between not being treated when all others are, and all being treated. Or the difference a change in your own condition would make given that others are assigned to the value that they actually have.

Below we describe a design that allows for the possibility of spillovers like this. Diagnosing the design shows how severe the problem of spillovers can be for estimation. Modifying this design lets you explore different types of solutions.

Design Declaration

We declare this design like this:

  • Model:

    Our model of the world specifies a population of units each of which is a member of a group (n_groups of size group_size each). The key assumption of this spillover design is that if any member of group, is treated then all units receive some equal benefit (with marginal gains possibly increasing or decreasing in the numbers treated). Specifically, letting \(G[i]\) denote the group of which \(i\) is a member, we assume:

\[Y_i = \left(\frac{1}{n_{G[i]}} \sum_{j\in G[i]}Z_j \right)^\gamma + \epsilon_i\]

  • Inquiry:

    The estimand of interest is the average, across all individuals, of the difference between the situation in which that individual only is treated and one in which no individuals are treated.

  • Data strategy:

    We randomly sample half the units to treatment and half to control, not taking account of group membership. .

  • Answer strategy:

    We subtract the mean of the control group from the mean of the treatment group in order to estimate the average treatment effect.

In code:

N_groups <- 80
N_i_group <- 3
sd_i <- 0.2
gamma <- 2

population <- declare_population(G = add_level(N = N_groups, 
    n = N_i_group), i = add_level(N = n, zeros = 0, ones = 1))
dgp <- function(i, Z, G, n) (sum(Z[G == G[i]])/n[i])^gamma + 
    rnorm(1) * sd_i
estimand <- declare_estimand(Treat_1 = mean(sapply(1:length(G), 
    function(i) {
        Z_i <- (1:length(G)) == i
        dgp(i, Z_i, G, n) - dgp(i, zeros, G, n)
    })), label = "estimand")
assignment <- declare_assignment()
reveal_Y <- declare_reveal(handler = fabricate, Y = sapply(1:N, 
    function(i) dgp(i, Z, G, n)))
estimator <- declare_estimator(Y ~ Z, estimand = estimand, 
    model = lm_robust, label = "naive")
spillover_design <- population + estimand + assignment + 
    reveal_Y + estimator

Diagnosis

The diagnosis of this design is below.

diagnosis <- diagnose_design(spillover_design)
Term Bias RMSE Power Coverage Mean Estimate SD Estimate Mean Se Type S Rate Mean Estimand
Z 0.22 0.23 1.00 0.01 0.33 0.05 0.04 0.00 0.11
(0.00) (0.00) (0.00) (0.00) (0.00) (0.00) (0.00) (0.00) (0.00)

The simplest point to note is that the estimates of effects are biased. This renders the excellent statistical power of the design somewhat meaningless.

The sign of the bias term is perhaps more surprising. The idea that positive spillovers produce an underestimate of effects is a familiar one. An interesting feature of the structure of spillovers in this design is that positive spillovers can produce an overestimate of treatment effects when spillover effects are increasing in the number of neighbors treated. The reason is that if a treated individual is more likely to be in a group with multiple treated individuals than is an untreated individual and so outcomes for treated units are enhanced by within group spillovers

Using the Spillover Designer

In R, you can generate a spillover design using the template function spillover_designer() in the DesignLibrary package by running the following lines, which load the package:

library(DesignLibrary)

We can then create specific designs by defining values for each argument. For example, we create a design called my_spillover_design with n_groups, group_size, sd, and gamma set to 40, 10, .25, and 4, respectively, by running the lines below.

spillover_design <- spillover_designer(n_groups = 40, 
                                                     group_size = 10,
                                                     sd = .25,
                                                     gamma = 4)

You can see more details on the spillover_designer() function and its arguments by running the following line of code:

??spillover_designer