*[From the blog “Don’t Put Too Much Meaning Into Control Variables” by Paul Hünermund, posted at his website,* __p-hunermund.com__*]*

###### “It’s commonplace in regression analyses to not only interpret the effect of the regressor of interest, *D*, on an outcome variable, *Y*, but also to discuss the coefficients of the control variables. Researchers then often use lines such as: *“effects of the controls have expected signs”*, etc. And it probably happened more than once that authors ran into troubles during peer-review because some regression coefficients were not in line with what reviewers expected.”

###### “… coefficients of control variables do not necessarily have a *structural* interpretation. Take the following simple example:”

###### “If we’re interested in estimating the causal effect of *X* on *Y* … it’s entirely sufficient to adjust for *W1* in this graph. … However, if we estimate the right-hand side, for example, by linear regression, the coefficient of *W1* will not represent its effect on *Y*. It partly picks up the effect of *W2* too, since *W1* and *W2* are correlated.”
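
A small linear sketch may help make the quoted argument concrete. The graph from the original post is not reproduced in this excerpt, so the setup here is an assumption: *W1* affects both *X* and *Y*, *W2* affects only *Y*, and *W1* and *W2* are correlated. The symbols β, γ₁, γ₂, δ, ε and ν are hypothetical notation introduced for illustration, not the author's, and ε is assumed uncorrelated with *X*, *W1* and *W2*.

```latex
\begin{aligned}
Y   &= \beta X + \gamma_1 W_1 + \gamma_2 W_2 + \varepsilon
      && \text{(assumed structural equation)} \\
W_2 &= \delta W_1 + \nu
      && \text{(linear projection; } \nu \text{ uncorrelated with } X \text{ and } W_1 \text{ by assumption)} \\
Y   &= \beta X + (\gamma_1 + \gamma_2 \delta)\, W_1 + (\gamma_2 \nu + \varepsilon)
      && \text{(substituting the second line into the first)}
\end{aligned}
```

Under these assumptions, a regression of *Y* on *X* and *W1* still recovers β for *X*, but the coefficient on *W1* equals γ₁ + γ₂δ, a mix of both controls' effects. Its sign can even differ from that of γ₁ when γ₂δ has the opposite sign and is large enough.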

###### “If we would also include *W2* in the regression, then the coefficients of the control variables could be interpreted structurally and would represent genuine causal effects. But in practice it’s very unlikely that we’ll be able to measure all causal parents of *Y*. The data collection efforts could just be too huge in a real-world situation.”
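
The contrast between leaving *W2* out and including it can be checked with a short simulation. This is a minimal sketch under assumed conditions, not the author's code: the data-generating process is linear, *W1* affects *X* and *Y*, *W2* affects only *Y*, and the two controls are correlated through a shared component. All coefficient values and the helper `ols` are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Assumed true effects on Y (illustrative values, not taken from the post).
beta, gamma1, gamma2 = 1.0, 0.5, -2.0

# W1 and W2 are correlated because they share a common component.
common = rng.normal(size=n)
w1 = common + rng.normal(size=n)
w2 = common + rng.normal(size=n)

# X depends on W1 only; Y depends on X, W1 and W2.
x = 0.8 * w1 + rng.normal(size=n)
y = beta * x + gamma1 * w1 + gamma2 * w2 + rng.normal(size=n)


def ols(outcome, *regressors):
    """Least-squares slopes of outcome on the regressors (intercept dropped)."""
    design = np.column_stack([np.ones(len(outcome)), *regressors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef[1:]


# Short regression: adjust for W1 only, as the backdoor argument allows.
short_fit = ols(y, x, w1)
# Long regression: also include W2, the other causal parent of Y.
long_fit = ols(y, x, w1, w2)

print("Y ~ X + W1      :", dict(zip(["X", "W1"], short_fit.round(3))))
print("Y ~ X + W1 + W2 :", dict(zip(["X", "W1", "W2"], long_fit.round(3))))
```

With these assumed values, the slope of *W2* on *W1* is 0.5, so the short regression's *W1* coefficient should land near 0.5 + (-2.0)(0.5) = -0.5, the opposite sign of its true effect, while the coefficient on *X* stays close to 1.0 in both regressions. Only the long regression returns values near the assumed structural effects for the controls.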

###### “But that also implies that the coefficients of controls lose their substantive meaning, because they now represent a complicated weighting of several causal influence factors. … if they don’t have expected signs, that’s not a problem.”

###### To read more, **click here**.

