Lagged Variables Make Good Instruments…Not!
[From the working paper, “Lagged Variables as Instruments” by Yu Wang and Marc Bellemare, posted at http://www.marcfbellemare.com]
“…applied econometricians often settle on less-than-ideal IVs in an effort to “exogenize” x … One such less-than-ideal identification strategy is the use of what we refer to throughout this paper as a ‘lagged IV’.”
“… a lagged IV entails using a lag xi,t-1 … as an IV for xit. The argument that is typically (and often implicitly) made in such cases is that since xi,t-1 precedes xit in time, the causality runs entirely from xi,t-1 to xit, and since there is presumably a high degree of autocorrelation in x, xi,t-1 should be a valid IV for xit.”
“In this paper, we look at the consequences of a lagged IV on the bias of the estimated coefficient …, its root mean squared error (RMSE), and on the likelihood of making a Type I error … We first do so analytically, which allows identifying the precise conditions under which one can use a lagged IV. … We next use Monte Carlo simulations to show what happens to bias, RMSE, and the likelihood of a Type I error for a broad range of the relevant parameters.”
“…we find that if the lagged IV xi,t-1 has no direct causal impact (i) on the dependent variable nor (ii) on the unobserved confounder, it … can mitigate the endogeneity problems by reducing bias and the root mean square error (RMSE) relative to OLS … however, the likelihood of a Type I error remains large.”
“On the other hand, we find that if the lagged IV xi,t-1 has a direct causal impact (i) on the dependent variable, (ii) on the unobserved confounder, or both, … a lagged IV worsens the endogeneity problem by increasing bias as well as the RMSE relative to OLS. Moreover, in such cases, the likelihood of a Type I error is almost always equal to one for common ranges of parameter values.”
“In practical terms, this means that the use of a lagged IV often leads one to report coefficient estimates of questionable economic significance (because of the increased bias) and statistical significance (because of the greater likelihood of a Type I error). Worse, the use of lagged IVs will tend to lead one to conclude that a causal relationship exists where it does not.”
“Suppose we have the structural equation
(1) yit = b xit + θ xi,t-1 + δ uit + εit ,
where y, x, and ε respectively denote the dependent variable, the variable of interest and an error term … but where u denotes confounders.”
“We specify two autocorrelation functions: one for x, and one for u, such that
(2) xit = ρ xi,t-1 + κ uit + ηit , and
(3) uit = φ ui,t-1 + ψ xi,t-1 + υit .”
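To make equations (1), (2), and (3) concrete, here is a minimal sketch that generates one unit's time series from them. The function name `simulate_series`, the standard normal shocks, the zero starting values, and the burn-in length are all illustrative assumptions, not choices taken from the paper.

```python
import numpy as np

def simulate_series(T, b, theta, delta, rho, kappa, phi, psi, burn=200, seed=0):
    """Simulate y, x, lagged x, and u for one unit from equations (1)-(3).

    Assumptions (not from the paper): i.i.d. standard normal shocks,
    zero initial values, and `burn` discarded start-up periods.
    """
    rng = np.random.default_rng(seed)
    n = T + burn
    eps, eta, ups = rng.standard_normal((3, n))
    y, x, u = np.zeros(n), np.zeros(n), np.zeros(n)
    for t in range(1, n):
        u[t] = phi * u[t - 1] + psi * x[t - 1] + ups[t]              # equation (3)
        x[t] = rho * x[t - 1] + kappa * u[t] + eta[t]                # equation (2)
        y[t] = b * x[t] + theta * x[t - 1] + delta * u[t] + eps[t]   # equation (1)
    # Return aligned arrays: x_lag[k] is the value of x one period before x[k].
    return y[burn:], x[burn:], x[burn - 1:-1], u[burn:]
```

For example, `simulate_series(T=500, b=1.0, theta=0.0, delta=1.0, rho=0.5, kappa=0.5, phi=0.5, psi=0.0)` draws a series in which the lagged x enters neither equation (1) nor equation (3) directly. Note that the parameters must keep the joint process for x and u stable, otherwise the simulated series will diverge.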
“Using the framework laid out in equations (1), (2), and (3), we can explore four distinct endogeneity scenarios:”
“1. θ = 0 and ψ = 0, i.e., the lagged variable of interest has no direct causal impact on the dependent variable, nor does it have a causal impact on the unobserved confounder.”
“2. θ ≠ 0 and ψ = 0, i.e., the lagged variable of interest has a direct causal impact on the dependent variable, but it does not have a causal impact on the unobserved confounder.”
“3. θ = 0 and ψ ≠ 0, i.e., the lagged variable of interest has no direct causal impact on the dependent variable, but it has a causal impact on the unobserved confounder.”
“4. θ ≠ 0 and ψ ≠ 0, i.e., the lagged variable of interest has a direct causal impact on the dependent variable, and it has a causal impact on the unobserved confounder.”
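The four scenarios can be compared in a small Monte Carlo experiment. The sketch below is not the authors' simulation design: the parameter values (true b = 0, δ = 1, ρ = κ = φ = 0.5, and θ = 0.5 or ψ = 0.3 when nonzero), the standard normal shocks, the homoskedastic standard errors, and the function names `simulate`, `estimate`, and `run_scenarios` are all assumptions made for illustration. Because the true b is set to zero, the share of replications with |t| > 1.96 is a Type I error rate.

```python
import numpy as np

def simulate(R, T, b, theta, delta, rho, kappa, phi, psi, rng, burn=100):
    """Draw R independent series of length T from equations (1)-(3)."""
    n = T + burn
    eps, eta, ups = rng.standard_normal((3, R, n))
    y, x, u = np.zeros((R, n)), np.zeros((R, n)), np.zeros((R, n))
    for t in range(1, n):
        u[:, t] = phi * u[:, t - 1] + psi * x[:, t - 1] + ups[:, t]
        x[:, t] = rho * x[:, t - 1] + kappa * u[:, t] + eta[:, t]
        y[:, t] = b * x[:, t] + theta * x[:, t - 1] + delta * u[:, t] + eps[:, t]
    return y[:, burn:], x[:, burn:], x[:, burn - 1:-1]

def estimate(y, x, z):
    """Just-identified IV slope and t-statistic per replication (z = x gives OLS)."""
    y, x, z = (a - a.mean(axis=1, keepdims=True) for a in (y, x, z))
    zx = (z * x).sum(axis=1)
    slope = (z * y).sum(axis=1) / zx
    resid = y - slope[:, None] * x
    s2 = (resid ** 2).sum(axis=1) / (y.shape[1] - 2)
    se = np.sqrt(s2 * (z ** 2).sum(axis=1)) / np.abs(zx)  # homoskedastic SE (assumed)
    return slope, slope / se

def run_scenarios(R=500, T=200, seed=0):
    rng = np.random.default_rng(seed)
    base = dict(b=0.0, delta=1.0, rho=0.5, kappa=0.5, phi=0.5)  # assumed values
    scenarios = {1: (0.0, 0.0), 2: (0.5, 0.0), 3: (0.0, 0.3), 4: (0.5, 0.3)}  # (theta, psi)
    results = {}
    for k, (theta, psi) in scenarios.items():
        y, x, x_lag = simulate(R, T, theta=theta, psi=psi, rng=rng, **base)
        results[k] = {}
        for name, z in (("ols", x), ("iv", x_lag)):
            slope, tstat = estimate(y, x, z)
            err = slope - base["b"]
            results[k][name] = dict(
                bias=err.mean(),
                rmse=np.sqrt((err ** 2).mean()),
                reject=(np.abs(tstat) > 1.96).mean(),  # Type I error rate, since b = 0
            )
    return results

if __name__ == "__main__":
    for k, row in run_scenarios().items():
        print(k, {m: {s: round(v, 3) for s, v in d.items()} for m, d in row.items()})
```

Under these assumed values, the lagged IV should show a smaller bias than OLS in scenario 1 and a larger bias than OLS in scenarios 2 to 4, which is the qualitative pattern the authors describe. Different parameter choices can change the magnitudes, so the output is an illustration rather than a replication.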
“In scenario 2, since θ ≠ 0, xi,t-1 directly influences yit via its marginal effect θ.”
“In scenario 3, although θ = 0, ψ ≠ 0, and xi,t-1 still influences yit via marginal effect δψ, derived from equations (1) and (3). Therefore, both scenarios 2 and 3 violate not only the independence assumption, but also the exclusion restriction, and so they will result in biased estimates.”
“A similar result obtains for scenario 4, which is just a combination of the undesirable features (i.e., θ ≠ 0 and ψ ≠ 0) in scenarios 2 and 3.”
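The channels in scenarios 2 to 4 can be made explicit by substituting equation (3) into equation (1). The following is a worked rearrangement of the quoted equations, not a derivation reproduced from the paper; φ is written as \phi.

```latex
\begin{aligned}
y_{it} &= b\,x_{it} + \theta\,x_{i,t-1}
          + \delta\,(\phi\,u_{i,t-1} + \psi\,x_{i,t-1} + \upsilon_{it}) + \varepsilon_{it} \\
       &= b\,x_{it} + (\theta + \delta\psi)\,x_{i,t-1}
          + \delta\phi\,u_{i,t-1} + \delta\,\upsilon_{it} + \varepsilon_{it}.
\end{aligned}
```

Holding xit fixed, the lag xi,t-1 enters the equation for yit with coefficient θ + δψ. That coefficient is θ in scenario 2, δψ in scenario 3, and the sum of the two in scenario 4, so the exclusion restriction fails in each case unless the two terms happen to cancel exactly.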
“Since in scenario 3, xi,t-1 has a direct impact on uit, which could include more than one unobserved covariate, it implies that xi,t-1 could have more than one causal path whereby it influences yit.”
“…even if theoretical arguments state that the lagged variable of interest has no direct impact on the dependent variable (in other words, even if those arguments make the case that scenario 2 does not hold), it is difficult to argue against the possible existence of scenarios 3 and 4, which both result in a violation of the exclusion restriction.”
“Turning to scenario 1, although the lagged IV in this case has neither a direct causal impact on the dependent variable nor on the unobserved confounder, the lagged IV may still indirectly be correlated with the dependent variable. Specifically, since ui,t−1 influences both uit and xi,t−1, xi,t-1 and uit have a simultaneous relationship.”
“In other words, as xi,t-1 changes, uit changes contemporaneously (albeit not causally), and so yit changes as well. In this case although xi,t-1 influences yit only through xit, which satisfies the exclusion restriction, the IV xi,t-1 violates the independence assumption because it does not serve as a random exogenous shock.”
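The size of this non-causal correlation in scenario 1 can be worked out from equations (2) and (3) with θ = 0 and ψ = 0. The derivation below is a sketch under added assumptions that are not stated in the quoted text: the processes are stationary, and the shocks η, υ, and ε are mutually and serially uncorrelated. Here σ_u² denotes the variance of u, and φ is written as \phi.

```latex
\begin{aligned}
c &\equiv \operatorname{Cov}(x_{it}, u_{it})
   = \rho\,\operatorname{Cov}(x_{i,t-1}, u_{it}) + \kappa\,\sigma_u^2
   = \rho\phi\,c + \kappa\,\sigma_u^2
   \;\Longrightarrow\; c = \frac{\kappa\,\sigma_u^2}{1 - \rho\phi}, \\[4pt]
\operatorname{Cov}(x_{i,t-1}, u_{it}) &= \phi\,c = \frac{\phi\,\kappa\,\sigma_u^2}{1 - \rho\phi}, \\[4pt]
\operatorname{plim}\,\hat{b}_{IV}
  &= \frac{\operatorname{Cov}(x_{i,t-1}, y_{it})}{\operatorname{Cov}(x_{i,t-1}, x_{it})}
   = b + \delta\,\frac{\operatorname{Cov}(x_{i,t-1}, u_{it})}{\operatorname{Cov}(x_{i,t-1}, x_{it})}.
\end{aligned}
```

Under these assumptions the lagged IV estimator is inconsistent in scenario 1 whenever δ, κ, and φ are all nonzero, even though xi,t-1 affects yit only through xit.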
“…the independence assumption can only be satisfied by assuming that there are no dynamics among unobserved confounders (Bellemare et al. 2017). Therefore, even if the dynamic causal impacts are restricted and thus the exclusion restriction is satisfied, a lagged IV can still be problematic because of the unavoidable violation of the independence assumption that it entails.”
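A quick numerical check illustrates why the confounder dynamics matter. The sketch below simulates scenario 1 (θ = 0, ψ = 0) and compares the sample covariance between xi,t-1 and uit with the stationary value φκ / ((1 − φ²)(1 − ρφ)), which follows from equations (2) and (3) when the shocks have unit variance. The function name `lag_confounder_cov`, the parameter values, and the unit-variance normal shocks are illustrative assumptions.

```python
import numpy as np

def lag_confounder_cov(phi, rho=0.5, kappa=0.5, T=100_000, seed=0):
    """Sample and theoretical Cov(x_{t-1}, u_t) in scenario 1 (theta = psi = 0).

    Assumes unit-variance normal shocks; T is large so that the zero
    starting values have a negligible effect.
    """
    rng = np.random.default_rng(seed)
    ups = rng.standard_normal(T)
    eta = rng.standard_normal(T)
    x, u = np.zeros(T), np.zeros(T)
    for t in range(1, T):
        u[t] = phi * u[t - 1] + ups[t]                   # equation (3) with psi = 0
        x[t] = rho * x[t - 1] + kappa * u[t] + eta[t]    # equation (2)
    sample = np.cov(x[:-1], u[1:])[0, 1]                 # pairs (x_{t-1}, u_t)
    theory = phi * kappa / ((1 - phi ** 2) * (1 - rho * phi))
    return sample, theory

if __name__ == "__main__":
    for phi in (0.5, 0.0):
        sample, theory = lag_confounder_cov(phi)
        print(f"phi={phi}: sample={sample:.3f}, theory={theory:.3f}")
```

With φ = 0 the theoretical covariance is exactly zero, which corresponds to the "no dynamics among unobserved confounders" condition in the quoted passage. With any nonzero φ (and κ ≠ 0) the instrument is correlated with the confounder.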
“The implications of our findings for the practice of applied econometrics are obvious. Unless one can make the claim that both the independence assumption and the exclusion restriction hold, lagged IVs should be avoided in the name of bias, RMSE, and the likelihood of a Type I error. But given that the independence assumption requires that one make the dubious assumption that there are no dynamics among unobserved confounders, this essentially means that lagged IVs should be avoided entirely.”