Builds condition probability matrices for Horvitz-Thompson estimation from a randomizr declaration

Usage

declaration_to_condition_pr_mat(
  ra_declaration,
  condition1 = NULL,
  condition2 = NULL,
  prob_matrix = NULL
)

Arguments

ra_declaration

An object of class "ra_declaration", generated by the declare_ra function in randomizr. This object contains the experimental design that will be represented in a condition probability matrix.

condition1

The name of the first condition, often the control group. If NULL, defaults to the first condition in the randomizr declaration. Either both condition1 and condition2 must be specified, or both must be left as NULL.

condition2

The name of the second condition, often the treatment group. If NULL, defaults to the second condition in the randomizr declaration. Either both condition1 and condition2 must be specified, or both must be left as NULL.

prob_matrix

An optional probability matrix to override the one in ra_declaration.

Value

A numeric 2n*2n matrix of marginal and joint treatment condition probabilities, to be passed to the condition_pr_mat argument of horvitz_thompson. See Details.

Details

This function takes an "ra_declaration" object, generated by the declare_ra function in randomizr, and returns a 2n*2n matrix that fully specifies the design for horvitz_thompson estimation. To use it, pass the matrix to the condition_pr_mat argument of horvitz_thompson.

Currently, this function can learn the condition probability matrix for a wide variety of randomizations: simple, complete, simple clustered, complete clustered, blocked, and block-clustered.

A condition probability matrix is made up of four submatrices, each of which holds the marginal and joint probabilities that observations are in one of the two treatment conditions.
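
As a compact summary of the layout described in the paragraphs that follow, the matrix can be written in block form. The symbols P_{ab} and c_a are notation introduced here for illustration only and do not appear in the package; c_1 and c_2 stand for condition 1 and condition 2.

```latex
\[
\begin{pmatrix}
P_{11} & P_{12} \\
P_{21} & P_{22}
\end{pmatrix},
\qquad
[P_{ab}]_{ij} = \Pr(Z_i = c_a,\; Z_j = c_b),
\qquad a, b \in \{1, 2\},\quad i, j \in \{1, \dots, n\}.
\]
\[
[P_{aa}]_{ii} = \Pr(Z_i = c_a),
\qquad
[P_{12}]_{ii} = [P_{21}]_{ii} = 0,
\qquad
P_{21} = P_{12}^{\top}.
\]
```

The last identity follows directly from the definition, since Pr(Z_i = c_2, Z_j = c_1) is the (j, i) entry of the upper-right quadrant.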

The upper-left quadrant is an n*n matrix. On the diagonal is the marginal probability of being in condition 1, often control, for every unit (Pr(Z_i = Condition1) where Z represents the vector of treatment conditions). The off-diagonal elements are the joint probabilities of each unit being in condition 1 with each other unit, Pr(Z_i = Condition1, Z_j = Condition1) where i indexes the rows and j indexes the columns.

The upper-right quadrant is also an n*n matrix. The diagonal holds the joint probability of a single unit being in both condition 1 and condition 2 (often the treatment); because a unit cannot be in two conditions at once, this is always 0. The off-diagonal elements are the joint probability of unit i being in condition 1 and unit j being in condition 2, Pr(Z_i = Condition1, Z_j = Condition2).

The lower-left quadrant is also an n*n matrix. The diagonal again holds the joint probability of a single unit being in both condition 1 and condition 2, and thus is always 0. The off-diagonal elements are the joint probability of unit i being in condition 2 and unit j being in condition 1, Pr(Z_i = Condition2, Z_j = Condition1).

The lower-right quadrant is an n*n matrix. On the diagonal is the marginal probability of being in condition 2, often treatment, for every unit (Pr(Z_i = Condition2)). The off-diagonal elements are the joint probabilities of each pair of units both being in condition 2, Pr(Z_i = Condition2, Z_j = Condition2).
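
To make the four quadrants concrete, the sketch below assembles the matrix by hand in base R for the simplest case: complete randomization of n units with exactly m assigned to condition 2. The helper complete_condition_pr_mat is hypothetical and is not part of estimatr or randomizr; it only illustrates the layout using the standard sampling-without-replacement probabilities, and is not the package's implementation.

```r
# Hypothetical helper (not part of estimatr): condition probability matrix
# for complete randomization, where exactly m of n units (n >= 2) are
# assigned to condition 2 and the remaining n - m to condition 1.
complete_condition_pr_mat <- function(n, m) {
  # Marginal probabilities for a single unit
  p1 <- (n - m) / n
  p2 <- m / n

  # Joint probabilities for two distinct units i != j
  p11 <- (n - m) * (n - m - 1) / (n * (n - 1))  # both in condition 1
  p22 <- m * (m - 1) / (n * (n - 1))            # both in condition 2
  p12 <- (n - m) * m / (n * (n - 1))            # one in each condition

  # Build one n*n quadrant from its diagonal and off-diagonal values
  quadrant <- function(diag_value, off_diag_value) {
    q <- matrix(off_diag_value, nrow = n, ncol = n)
    diag(q) <- diag_value
    q
  }

  rbind(
    cbind(quadrant(p1, p11), quadrant(0, p12)),  # upper-left, upper-right
    cbind(quadrant(0, p12), quadrant(p2, p22))   # lower-left, lower-right
  )
}

pr_mat <- complete_condition_pr_mat(n = 4, m = 2)
dim(pr_mat)
```

With n = 4 and m = 2 the result is an 8*8 matrix. Under the same design, one would expect declaration_to_condition_pr_mat(declare_ra(N = 4, m = 2)) to return matching values, though that comparison is left for the reader to verify.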

Examples


# Learn condition probability matrix from complete blocked design
library(randomizr)
n <- 100
dat <- data.frame(
  blocks = sample(letters[1:10], size = n, replace = TRUE),
  y = rnorm(n)
)

# Declare complete blocked randomization
bl_declaration <- declare_ra(blocks = dat$blocks, prob = 0.4, simple = FALSE)
# Get probabilities
block_pr_mat <- declaration_to_condition_pr_mat(bl_declaration, 0, 1)
# Do randomization
dat$z <- conduct_ra(bl_declaration)

horvitz_thompson(y ~ z, data = dat, condition_pr_mat = block_pr_mat)
#>     Estimate Std. Error    t value  Pr(>|t|)   CI Lower  CI Upper DF
#> z -0.1744343  0.2227431 -0.7831187 0.4335574 -0.6110028 0.2621342 NA

# When you pass a declaration to horvitz_thompson, this function is called

# Equivalent to above call
horvitz_thompson(y ~ z, data = dat, ra_declaration = bl_declaration)
#>     Estimate Std. Error    t value  Pr(>|t|)   CI Lower  CI Upper DF
#> z -0.1744343  0.2227431 -0.7831187 0.4335574 -0.6110028 0.2621342 NA