../vignettes/absorbing-fixed-effects.Rmd
absorbing-fixed-effects.Rmd
Whether analyzing a block-randomized experiment or adding fixed effects for a panel model, absorbing group means can speed up estimation time. The fixed_effects
argument in both lm_robust
and iv_robust
allows you to do just that, although the speed gains are greatest with “HC1” standard errors. Specifying fixed effects is really simple.
## Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
## hp -0.02403883 0.01503818 -1.598521 0.1211523 -0.05484314 0.006765475 28
lmr_out$fixed_effects
## cyl4 cyl6 cyl8
## 28.65012 22.68246 20.12927
Before proceeding, three quick notes:
fixed_effects = ~ year + country
, please ensure that your model is well-specified if you do so. If there are dependencies or overlapping groups across multiple sets of fixed effects, we cannot guarantee the correct degrees of freedom.In general, our speed gains will be greatest as the number of groups/fixed effects is large relative to the number of observations. Imagine we have 300 matched-pairs in an experiment.
# Load packages for comparison
library(microbenchmark)
library(sandwich)
library(lmtest)
# Create matched-pairs dataset using fabricatr
set.seed(40)
library(fabricatr)
dat <- fabricate(
blocks = add_level(N = 300),
indiv = add_level(N = 2, z = sample(0:1), y = rnorm(N) + z)
)
head(dat)
## blocks indiv z y
## 1 001 001 1 1.4961828
## 2 001 002 0 -0.8595843
## 3 002 003 1 0.1709400
## 4 002 004 0 -0.3215731
## 5 003 005 1 -0.3037704
## 6 003 006 0 -1.4214866
# With HC2
microbenchmark(
`base + sandwich` = {
lo <- lm(y ~ z + factor(blocks), dat)
coeftest(lo, vcov = vcovHC(lo, type = "HC2"))
},
`lm_robust` = lm_robust(y ~ z + factor(blocks), dat),
`lm_robust + fes` = lm_robust(y ~ z, data = dat, fixed_effects = ~ blocks),
times = 50
)
## Unit: milliseconds
## expr min lq mean median uq max
## base + sandwich 172.20425 180.94536 242.72733 201.00020 287.88194 691.4615
## lm_robust 78.88245 81.85164 91.87740 86.61242 92.16225 165.5988
## lm_robust + fes 50.16316 53.14430 78.23566 55.13875 83.61794 319.5033
## neval cld
## 50 b
## 50 a
## 50 a
Speed gains are considerably greater with HC1 standard errors. This is because we need to get the hat matrix for HC2, HC3, and CR2 standard errors, which requires inverting that large matrix of dummies we previously avoided doing. HC0, HC1, CR0, and CRstata standard errors do not require this inversion.
# With HC1
microbenchmark(
`base + sandwich` = {
lo <- lm(y ~ z + factor(blocks), dat)
coeftest(lo, vcov = vcovHC(lo, type = "HC1"))
},
`lm_robust` = lm_robust(
y ~ z + factor(blocks),
dat,
se_type = "HC1"
),
`lm_robust + fes` = lm_robust(
y ~ z,
data = dat,
fixed_effects = ~ blocks,
se_type = "HC1"
),
times = 50
)
## Unit: milliseconds
## expr min lq mean median uq max
## base + sandwich 175.970195 186.143449 251.11769 214.05035 297.31150 467.16739
## lm_robust 63.779014 67.593627 82.45926 70.67988 89.27966 152.03986
## lm_robust + fes 8.067613 8.579283 12.96284 10.17446 13.97928 65.73965
## neval cld
## 50 c
## 50 b
## 50 a