Lagged Variables Make Good Instruments…Not!

[From the working paper, “Lagged Variables as Instruments” by Yu Wang and Marc Bellemare, posted at http://www.marcfbellemare.com]

“…applied econometricians often settle on less-than-ideal IVs in an effort to “exogenize” x … One such less-than-ideal identification strategy is the use of what we refer to throughout this paper as a ‘lagged IV’.”

“… a lagged IV entails using a lag x_i,t-1 … as an IV for x_it. The argument that is typically (and often implicitly) made in such cases is that since x_i,t-1 precedes x_it in time, the causality runs entirely from x_i,t-1 to x_it, and since there is presumably a high degree of autocorrelation in x, x_i,t-1 should be a valid IV for x_it.”

“In this paper, we look at the consequences of a lagged IV on the bias of the estimated coefficient …, its root mean squared error (RMSE), and on the likelihood of making a Type I error … We first do so analytically, which allows identifying the precise conditions under which one can use a lagged IV. … We next use Monte Carlo simulations to show what happens to bias, RMSE, and the likelihood of a Type I error for a broad range of the relevant parameters.”

“…we find that if the lagged IV x_i,t-1 has no direct causal impact (i) on the dependent variable nor (ii) on the unobserved confounder, it … can mitigate the endogeneity problems by reducing bias and the root mean square error (RMSE) relative to OLS … however, the likelihood of a Type I error remains large.”

“On the other hand, we find that if the lagged IV x_i,t-1 has a direct causal impact (i) on the dependent variable, or (ii) on the unobserved confounder, or both, … a lagged IV worsens the endogeneity problem by increasing bias as well as the RMSE relative to OLS. Moreover, in such cases, the likelihood of Type I error is almost always equal to one for common ranges of parameter values.”

“In practical terms, this means that the use of a lagged IV often leads one to report coefficient estimates of questionable economic significance (because of the increased bias) and statistical significance (because of the greater likelihood of a Type I error). Worse, the use of lagged IVs will tend to lead one to conclude that a causal relationship exists where it does not.”

“Suppose we have the structural equation

(1) y_it = b x_it + θ x_i,t-1 + δ u_it + ε_it ,

where y, x, and ε respectively denote the dependent variable, the variable of interest and an error term … but where u denotes confounders.”

“We specify two autocorrelation functions: one for x, and one for u, such that

(2) x_it = ρ x_i,t-1 + κ u_it + η_it , and

(3) u_it = φ u_i,t-1 + ψ x_i,t-1 + υ_it .”
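To make the setup concrete, here is a minimal simulation sketch of equations (1), (2), and (3). The function name `simulate_panel`, the default parameter values, the standard normal shocks, and the burn-in length are all assumptions chosen for illustration; none of them come from the paper.

```python
import numpy as np

def simulate_panel(n_units=500, n_periods=10, b=1.0, theta=0.0, delta=1.0,
                   rho=0.5, kappa=0.5, phi=0.5, psi=0.0, burn_in=50, seed=0):
    """Simulate equations (1)-(3) for n_units units.

    Returns y, x, u, each of shape (n_units, n_periods + 1), after dropping
    burn_in initial periods. All shocks are N(0, 1) (an assumption).
    """
    rng = np.random.default_rng(seed)
    total = burn_in + n_periods + 1
    x = np.zeros((n_units, total))
    u = np.zeros((n_units, total))
    y = np.zeros((n_units, total))
    for t in range(1, total):
        # Equation (3): confounder depends on its own lag and on lagged x.
        u[:, t] = phi * u[:, t - 1] + psi * x[:, t - 1] + rng.standard_normal(n_units)
        # Equation (2): x depends on its own lag and on the current confounder.
        x[:, t] = rho * x[:, t - 1] + kappa * u[:, t] + rng.standard_normal(n_units)
        # Equation (1): the structural equation for y.
        y[:, t] = (b * x[:, t] + theta * x[:, t - 1] + delta * u[:, t]
                   + rng.standard_normal(n_units))
    keep = slice(burn_in, total)
    return y[:, keep], x[:, keep], u[:, keep]

# Usage: correlation between the lagged regressor and the current confounder
# when theta = 0 and psi = 0 (the defaults).
y, x, u = simulate_panel()
print(np.corrcoef(x[:, :-1].ravel(), u[:, 1:].ravel())[0, 1])
```

Setting `theta` and `psi` to zero or nonzero values switches between the cases in which lagged x does or does not enter equations (1) and (3).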

“Using the framework laid out in equations (1), (2), and (3), we can explore four distinct endogeneity scenarios:”

“1. θ = 0 and ψ = 0, i.e., the lagged variable of interest has no direct causal impact on the dependent variable, nor does it have a causal impact on the unobserved confounder.”

“2. θ ≠ 0 and ψ = 0, i.e., the lagged variable of interest has a direct causal impact on the dependent variable, but it does not have a causal impact on the unobserved confounder.”

“3. θ = 0 and ψ ≠ 0, i.e., the lagged variable of interest has no direct causal impact on the dependent variable, but it has a causal impact on the unobserved confounder.”

“4. θ ≠ 0 and ψ ≠ 0, i.e., the lagged variable of interest has a direct causal impact on the dependent variable, and it has a causal impact on the unobserved confounder.”

“In scenario 2, since θ ≠ 0, x_i,t-1 directly influences y_it via its marginal effect θ.”

“In scenario 3, although θ = 0, ψ ≠ 0, and x_i,t-1 still influences y_it via marginal effect δψ, derived from equations (1) and (3). Therefore, both scenarios 2 and 3 violate not only the independence assumption, but also the exclusion restriction, and so they will result in biased estimates.”
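As a check on the channels involved, one can substitute equation (3) into equations (2) and (1) and collect the terms in x_i,t-1. This is a sketch that takes the three equations exactly as written above and holds the shocks fixed; it is not reproduced from the paper.

```latex
\begin{align*}
x_{it} &= (\rho + \kappa\psi)\,x_{i,t-1} + \kappa\phi\,u_{i,t-1}
          + \kappa\,\upsilon_{it} + \eta_{it}, \\
y_{it} &= \bigl[\, b(\rho + \kappa\psi) + \theta + \delta\psi \,\bigr]\,x_{i,t-1}
          + (b\kappa + \delta)\,\phi\,u_{i,t-1}
          + (b\kappa + \delta)\,\upsilon_{it} + b\,\eta_{it} + \varepsilon_{it}.
\end{align*}
```

The term b(ρ + κψ) operates through x_it, while θ + δψ reaches y_it without passing through x_it. With θ = 0, the part that bypasses x_it is δψ, which is nonzero whenever ψ ≠ 0 and δ ≠ 0.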

“A similar result obtains for scenario 4, which is just a combination of the undesirable features (i.e., θ ≠ 0 and ψ ≠ 0) in scenarios 2 and 3.”
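The four scenarios can be compared numerically with a short sketch that draws one large sample per scenario and computes the OLS and lagged-IV slope estimates. The parameter values (ρ = κ = φ = 0.5, δ = 1, b = 1, θ = 0.5 and ψ = 0.3 when nonzero), the standard normal shocks, and all function names are assumptions made for illustration, not the paper's simulation design.

```python
import numpy as np

# (theta, psi) for each scenario; the nonzero values are illustrative assumptions.
SCENARIOS = {1: (0.0, 0.0), 2: (0.5, 0.0), 3: (0.0, 0.3), 4: (0.5, 0.3)}

def simulate(theta, psi, b=1.0, delta=1.0, rho=0.5, kappa=0.5, phi=0.5,
             n_units=5000, n_periods=10, burn_in=50, seed=0):
    """Draw pooled (y_it, x_it, x_i,t-1) from equations (1)-(3), N(0, 1) shocks."""
    rng = np.random.default_rng(seed)
    total = burn_in + n_periods + 1
    x = np.zeros((n_units, total))
    u = np.zeros((n_units, total))
    y = np.zeros((n_units, total))
    for t in range(1, total):
        u[:, t] = phi * u[:, t - 1] + psi * x[:, t - 1] + rng.standard_normal(n_units)
        x[:, t] = rho * x[:, t - 1] + kappa * u[:, t] + rng.standard_normal(n_units)
        y[:, t] = (b * x[:, t] + theta * x[:, t - 1] + delta * u[:, t]
                   + rng.standard_normal(n_units))
    first = burn_in + 1
    # Current y and x, plus x lagged one period, stacked over units and periods.
    return y[:, first:].ravel(), x[:, first:].ravel(), x[:, first - 1:-1].ravel()

def ols_slope(y, x):
    """Slope from a regression of y on x with an intercept."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

def iv_slope(y, x, z):
    """Just-identified IV slope using z as the instrument for x."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

def bias_by_scenario(b=1.0):
    """Return {scenario: (OLS bias, lagged-IV bias)} from one large draw each."""
    out = {}
    for name, (theta, psi) in SCENARIOS.items():
        y, x, x_lag = simulate(theta, psi, b=b)
        out[name] = (ols_slope(y, x) - b, iv_slope(y, x, x_lag) - b)
    return out

if __name__ == "__main__":
    for name, (ols_bias, iv_bias) in bias_by_scenario().items():
        print(f"scenario {name}: OLS bias {ols_bias:+.3f}, lagged-IV bias {iv_bias:+.3f}")
```

Under these assumed parameter values, the lagged-IV estimate should sit closer to b than OLS only in scenario 1 and farther from it in scenarios 2, 3, and 4. Other parameter choices may change the magnitudes.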

“Since in scenario 3, x_i,t-1 has a direct impact on u_it, which could include more than one unobserved covariate, it implies that x_i,t-1 could have more than one causal path whereby it influences y_it.”

“…even if theoretical arguments state that the lagged variable of interest has no direct impact on the dependent variable (in other words, even if those arguments make the case that scenario 2 does not hold), it is difficult to argue against the possible existence of scenarios 3 and 4, both of which result in a violation of the exclusion restriction.”

“Turning to scenario 1, although the lagged IV in this case has neither a direct causal impact on the dependent variable nor on the unobserved confounder, the lagged IV may still indirectly be correlated with the dependent variable. Specifically, since u_i,t-1 influences both u_it and x_i,t-1, x_i,t-1 and u_it have a simultaneous relationship.”

“In other words, as x_i,t-1 changes, u_it changes contemporaneously (albeit not causally), and so y_it changes as well. In this case although x_i,t-1 influences y_it only through x_it, which satisfies the exclusion restriction, the IV x_i,t-1 violates the independence assumption because it does not serve as a random exogenous shock.”

“…the independence assumption can only be satisfied by assuming that there are no dynamics among unobserved confounders (Bellemare et al. 2017). Therefore, even if the dynamic causal impacts are restricted and thus the exclusion restriction is satisfied, a lagged IV can still be problematic because of the unavoidable violation of the independence assumption that it entails.”
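To see why dynamics in u matter even when θ = 0 and ψ = 0, a short derivation helps. It is a sketch under added assumptions that the excerpt does not state: |ρ| < 1 and |φ| < 1 so that the processes are stationary, the shocks η, υ, and ε are mutually and serially independent, and σ_u² denotes Var(u_it).

```latex
\begin{align*}
C \equiv \operatorname{Cov}(x_{it}, u_{it})
  &= \rho\,\operatorname{Cov}(x_{i,t-1}, u_{it}) + \kappa\,\sigma_u^2, \\
\operatorname{Cov}(x_{i,t-1}, u_{it})
  &= \phi\,\operatorname{Cov}(x_{i,t-1}, u_{i,t-1}) = \phi\,C, \\
\Rightarrow\quad \operatorname{Cov}(x_{i,t-1}, u_{it})
  &= \frac{\phi\,\kappa\,\sigma_u^2}{1 - \rho\phi}, \\
\operatorname{plim}\,\hat{b}_{IV}
  = \frac{\operatorname{Cov}(x_{i,t-1}, y_{it})}{\operatorname{Cov}(x_{i,t-1}, x_{it})}
  &= b + \delta\,\frac{\operatorname{Cov}(x_{i,t-1}, u_{it})}{\operatorname{Cov}(x_{i,t-1}, x_{it})}.
\end{align*}
```

Under these assumptions the covariance between the instrument and the confounder vanishes only if φ = 0 or κ = 0. Since κ = 0 would mean x is not confounded in the first place, with δ ≠ 0 the lagged IV is consistent only when the confounder has no autocorrelation.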

“The implications of our findings for the practice of applied econometrics are obvious. Unless one can make the claim that both the independence assumption and the exclusion restriction hold, lagged IVs should be avoided in the name of bias, RMSE, and the likelihood of a Type I error. But given that the independence assumption requires that one make the dubious assumption that there are no dynamics among unobserved confounders, this essentially means that lagged IVs should be avoided entirely.”
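The three criteria named here (bias, RMSE, and the likelihood of a Type I error) can be estimated with a small Monte Carlo sketch. Everything below is an illustrative assumption rather than the paper's design: the function names, the parameter values, the standard normal shocks, the sample sizes, and the use of conventional homoskedastic IV standard errors that ignore serial correlation within units. The true b is set to zero so that any rejection of H0: b = 0 is a Type I error.

```python
import numpy as np

def simulate(rng, b, theta, psi, delta=1.0, rho=0.5, kappa=0.5, phi=0.5,
             n_units=200, n_periods=5, burn_in=30):
    """Draw pooled (y_it, x_it, x_i,t-1) from equations (1)-(3), N(0, 1) shocks."""
    total = burn_in + n_periods + 1
    x = np.zeros((n_units, total))
    u = np.zeros((n_units, total))
    y = np.zeros((n_units, total))
    for t in range(1, total):
        u[:, t] = phi * u[:, t - 1] + psi * x[:, t - 1] + rng.standard_normal(n_units)
        x[:, t] = rho * x[:, t - 1] + kappa * u[:, t] + rng.standard_normal(n_units)
        y[:, t] = (b * x[:, t] + theta * x[:, t - 1] + delta * u[:, t]
                   + rng.standard_normal(n_units))
    first = burn_in + 1
    return y[:, first:].ravel(), x[:, first:].ravel(), x[:, first - 1:-1].ravel()

def iv_fit(y, x, z):
    """Just-identified IV slope and its conventional (homoskedastic) standard error."""
    y, x, z = y - y.mean(), x - x.mean(), z - z.mean()
    beta = (z @ y) / (z @ x)
    resid = y - beta * x
    sigma2 = resid @ resid / (len(y) - 2)
    se = np.sqrt(sigma2 * (z @ z)) / abs(z @ x)
    return beta, se

def monte_carlo(theta=0.0, psi=0.0, b=0.0, reps=200, seed=0):
    """Estimate bias, RMSE, and the rejection rate of the true null for the lagged IV."""
    rng = np.random.default_rng(seed)
    estimates = np.empty(reps)
    rejections = np.empty(reps, dtype=bool)
    for r in range(reps):
        y, x, x_lag = simulate(rng, b, theta, psi)
        beta, se = iv_fit(y, x, x_lag)
        estimates[r] = beta
        # Two-sided 5% test of the true value of b; a rejection is a Type I error.
        rejections[r] = abs((beta - b) / se) > 1.96
    errors = estimates - b
    return {"bias": errors.mean(),
            "rmse": np.sqrt((errors ** 2).mean()),
            "type1": rejections.mean()}

if __name__ == "__main__":
    print(monte_carlo())                      # theta = 0, psi = 0
    print(monte_carlo(theta=0.5, psi=0.3))    # theta != 0, psi != 0
```

Passing different `theta` and `psi` values to `monte_carlo` lets one compare the case in which lagged x affects neither y nor u with the cases in which it affects one or both.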


The Replication Network

Furthering the Practice of Replication in Economics
