1

I am using lm_robust of package 'estimatr' for a fixed effect model including HC3 robust standard errors. I had to switch from vcovHC(), because my data sample was just to large to be handled by it.

using following line for the regression:

lm_robust(log(SPREAD) ~ PERIOD, data = dat, fixed_effects = ~ STOCKS + TIME, se_type = "HC3")

The code runs fine, and the coefficients are the same as using fixed effects from package plm. Since I can not use coeftest to estimate HC3 standard errors with the plm output due to a too large data sample, I compared the HC3 estimator of lm_robustwith the HC1 of coeftest(model, vcov= vcovHC(model, type = HC1)) As result the HC3 standarderror of lm_robust is much smaller than HC1 from coeftest.

Does somebody has an explanation, since HC3 should be more restrictive than HC1. I appreciate any recommendations and solutions.

EDIT model used for coeftest:

plm(log(SPREAD) ~ PERIOD, data = dat, index = c("STOCKS", "TIME"), effect = "twoway", method = "within")

4
  • could you share your data? also, be nice to know the size of STOCKS, TIME, and the number of observations Commented Apr 15, 2020 at 15:36
  • Yes for sure; data can be downloaded at github.com/dpendi/data size of STOCKS is 288 and TIME has 312 days. therefore 312 * 288 observations. please see my EDIT above for both code lines used in comparison. Since I am interested in using HC3 estimator but are not able due to computational power needed in coeftest. Commented Apr 15, 2020 at 19:58
  • i get 0.005444 for type_se = "HC1 and 0.005463 for type_se = "HC3 using estimatr_0.18.0. Looks ok to me. Commented Apr 15, 2020 at 23:42
  • Yes, using HC1 and HC3 with the same function (lm_robust) it results in a reasonable order. But I compare the HC3 of lm_robust with the HC1 of coeftest() applied for the plm model Commented Apr 16, 2020 at 7:56

1 Answer 1

2

It appears that the vcovHC() method for plm automatically estimates cluster-robust standard errors, while for lm_robust(), it does not. Therefore, the HC1 estimation of the standard error for plm will appear inflated compared to lm_robust (of lm for that matter).

Using some toy data:

library(sandwich)
library(tidyverse)
library(plm)
library(estimatr)
library(lmtest)

set.seed(1981)
x <- sin(1:1000)
y <- 1 + x + rnorm(1000)
f <- as.character(sort(rep(sample(1:100), 10)))
t <- as.character(rep(sort(sample(1:10)), 100))

dat <- tibble(y = y, x = x, f = f, t = t)

lm_fit <- lm(y ~ x + f + t, data = dat)
plm_fit <- plm(y ~ x, index = c("f", "t"), model = "within", effect = "twoways", data = dat)
rb_fit <- lm_robust(y ~ x, fixed_effects = ~ f + t, data = dat, se_type = "HC1", return_vcov = TRUE)

sqrt(vcovHC(lm_fit, type = "HC1")[2, 2])
#> [1] 0.04752337
sqrt(vcovHC(plm_fit, type = "HC1"))
#>            x
#> x 0.05036414
#> attr(,"cluster")
#> [1] "group"
sqrt(rb_fit$vcov)
#>            x
#> x 0.04752337

rb_fit <- lm_robust(y ~ x, fixed_effects = ~ f + t, data = dat, se_type = "HC3", return_vcov = TRUE)
sqrt(vcovHC(lm_fit, type = "HC3")[2, 2])
#> [1] 0.05041177
sqrt(vcovHC(plm_fit, type = "HC3"))
#>            x
#> x 0.05042142
#> attr(,"cluster")
#> [1] "group"
sqrt(rb_fit$vcov)
#>            x
#> x 0.05041177

There does not appear to be equivalent cluster-robust standard error types in the two packages. However, the SEs get closer when specifying cluster-robust SEs in lm_robust():

rb_fit <- lm_robust(y ~ x, fixed_effects = ~ f + t, clusters = f, data = dat, se_type = "CR0")
summary(rb_fit)
#> 
#> Call:
#> lm_robust(formula = y ~ x, data = dat, clusters = f, fixed_effects = ~f + 
#>     t, se_type = "CR0")
#> 
#> Standard error type:  CR0 
#> 
#> Coefficients:
#>   Estimate Std. Error t value  Pr(>|t|) CI Lower CI Upper DF
#> x    0.925    0.05034   18.38 1.133e-33   0.8251    1.025 99
#> 
#> Multiple R-squared:  0.3664 ,    Adjusted R-squared:  0.2888
#> Multiple R-squared (proj. model):  0.3101 ,  Adjusted R-squared (proj. model):  0.2256 
#> F-statistic (proj. model): 337.7 on 1 and 99 DF,  p-value: < 2.2e-16
coeftest(plm_fit, vcov. = vcovHC(plm_fit, type = "HC1"))
#> 
#> t test of coefficients:
#> 
#>   Estimate Std. Error t value  Pr(>|t|)    
#> x 0.925009   0.050364  18.366 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Created on 2020-04-16 by the reprex package (v0.3.0)

Sign up to request clarification or add additional context in comments.

1 Comment

Thank you very much! Now I understand the difference between the approaches!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.