Control Variables Don’t Have to Have the Right Signs
[From the blog “Don’t Put Too Much Meaning Into Control Variables” by Paul Hünermund, posted at his website, p-hunermund.com]
“It’s commonplace in regression analyses to not only interpret the effect of the regressor of interest, D, on an outcome variable, Y, but also to discuss the coefficients of the control variables. Researchers then often use lines such as: ‘effects of the controls have expected signs’, etc. And it probably happened more than once that authors ran into troubles during peer-review because some regression coefficients were not in line with what reviewers expected.”
“… coefficients of control variables do not necessarily have a structural interpretation. Take the following simple example:”
“If we’re interested in estimating the causal effect of X on Y … it’s entirely sufficient to adjust for W1 in this graph. …However, if we estimate the right-hand side, for example, by linear regression, the coefficient of W1 will not represent its effect on Y. It partly picks up the effect of W2 too, since W1 and W2 are correlated.”
“If we would also include W2 in the regression, then the coefficients of the control variables could be interpreted structurally and would represent genuine causal effects. But in practice it’s very unlikely that we’ll be able to measure all causal parents of Y. The data collection efforts could just be too huge in a real-world situation.”
“But that also implies that the coefficients of controls lose their substantive meaning, because they now represent a complicated weighting of several causal influence factors. … if they don’t have expected signs, that’s not a problem.”
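A small simulation can make the point concrete. The sketch below is not from the original post: the data-generating process, the coefficient values (1.0, 0.5, -2.0) and the helper name `ols` are all assumptions chosen for illustration. It assumes a structure consistent with the quoted description, where W1 affects both X and Y, W2 affects only Y, and W1 and W2 are correlated through a shared unobserved factor.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Assumed data-generating process (illustrative only, not from the post).
# W1 and W2 are correlated because they share the unobserved factor u.
u = rng.normal(size=n)
w1 = u + rng.normal(size=n)
w2 = u + rng.normal(size=n)

# W1 causes X; X, W1 and W2 all cause Y.
x = 0.8 * w1 + rng.normal(size=n)
y = 1.0 * x + 0.5 * w1 - 2.0 * w2 + rng.normal(size=n)


def ols(outcome, *regressors):
    """Least-squares slopes (intercept fitted but dropped). Hypothetical helper."""
    design = np.column_stack([np.ones_like(outcome), *regressors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef[1:]


# Adjusting for W1 only: enough to identify the effect of X on Y.
b_short = ols(y, x, w1)

# Adjusting for W1 and W2: all causal parents of Y are included.
b_long = ols(y, x, w1, w2)

print("controls: W1      ->  X: %.3f  W1: %.3f" % tuple(b_short))
print("controls: W1, W2  ->  X: %.3f  W1: %.3f  W2: %.3f" % tuple(b_long))
```

In both regressions the coefficient on X should land close to its assumed structural value of 1.0. The coefficient on W1 is a different story: with W2 omitted it absorbs part of W2's effect, and under these assumptions its population value is 0.5 + (-2.0)(0.5) = -0.5, so it comes out with the "wrong" sign even though the estimate for X is fine. Only when W2 is added does the W1 coefficient move back toward 0.5.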