*[From the blog “Power to the Plan” by Clare Leaver, Owen Ozier, Pieter Serneels, and Andrew Zeitlin, posted at BITSS]*

###### “…Our blinded pre-analytical work uncovered two decision margins that could deliver substantial increases in power: **changing test statistics used** and **putting structure on a model for error terms**. Because the value of these decisions depends on things that are hard to know ex ante — even using baseline data — they create a case for blinded analysis of endline outcomes. We argue that there are circumstances in which this can be done without risk of p-hacking, and in which the power gains from these decision margins are substantial.”

###### “…Kolmogorov-Smirnov (KS) tests can be better powered than OLS t-tests by a factor of four, even under additive treatment effects.”
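The comparison behind this claim can be illustrated with a small simulation. The sketch below is not the authors' code or data: it assumes lognormal (heavy-tailed) outcomes, a constant additive treatment effect, and a permutation test for both statistics. The function names, sample sizes, and effect size are all hypothetical choices made for illustration.

```python
import random


def diff_in_means(a, b):
    """Absolute difference in sample means (the statistic behind a t-test)."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both empirical CDFs past every observation equal to x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


def permutation_pvalue(stat, treated, control, n_perms, rng):
    """Share of re-randomized assignments with a statistic at least as large as observed."""
    observed = stat(treated, control)
    pooled = treated + control  # new list, so the inputs are not shuffled
    n_t = len(treated)
    exceed = 0
    for _ in range(n_perms):
        rng.shuffle(pooled)
        if stat(pooled[:n_t], pooled[n_t:]) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perms)


def simulate_power(effect=1.0, n_per_arm=40, n_sims=60, n_perms=199,
                   alpha=0.05, seed=0):
    """Rejection rate of each statistic under an additive effect (assumed DGP)."""
    rng = random.Random(seed)
    stats = {"diff_in_means": diff_in_means, "ks": ks_statistic}
    rejections = {name: 0 for name in stats}
    for _ in range(n_sims):
        # Assumption: lognormal outcomes, treatment shifts every unit by `effect`.
        control = [rng.lognormvariate(0.0, 1.5) for _ in range(n_per_arm)]
        treated = [rng.lognormvariate(0.0, 1.5) + effect for _ in range(n_per_arm)]
        for name, stat in stats.items():
            if permutation_pvalue(stat, treated, control, n_perms, rng) <= alpha:
                rejections[name] += 1
    return {name: count / n_sims for name, count in rejections.items()}


if __name__ == "__main__":
    print(simulate_power())
```

With outcomes this skewed, a few extreme draws dominate the difference in means, while the KS statistic responds to the shift in the bulk of the distribution. How large the gap is in practice depends on the actual outcome distribution, which is the kind of thing blinded endline data can reveal.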

###### “…Remember how machine learning is a way of getting a better fit using observables? Imposing structure on error terms is a way of getting a better fit on the *unobserved* sources of variation. That structure can take many forms: it can relate to the correlations between units, the distribution of residuals (normal? Pareto?), or both. *Imbens and Rubin (2015, p. 68)* observe that test statistics derived from structural estimates — for example, expressly modeling the error term — can improve power to the extent that they represent a “good descriptive approximation” to the data generating process. Blinded endline data allowed us to learn about the quality of such approximations, with substantial consequences.”


###### “In our setting, when we turned to look at effects on student outcomes, we intended to use a linear model … But there were still a number of potential correlations to consider: some students are observed at more than one point in time; each student has multiple teachers, and schools may have both incumbents and teachers recruited under a variety of contract expectations. Linear mixed-effects (LME) models provide an avenue for implementing this.”

###### “Our LME model, which assumes normally distributed error terms that include a common shock at the pupil level, delivers an estimator of the effect of interest that has a standard deviation as much as 30 percent smaller than the equivalent OLS estimator. Because normality is a reasonable approximation to these error terms, the structure of LME allows it to outperform traditional random-effects. The gains from LME are conceptually comparable to an increase of 70 percent in sample size.”
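To see why modeling a pupil-level shock can tighten an estimator, consider a deliberately simplified case. This is not the authors' LME specification. It assumes one random intercept per pupil, treatment assigned at the pupil level, an unbalanced number of observations per pupil, and variance components that are known rather than estimated. All names and parameter values are hypothetical.

```python
import random
import statistics


def simulate_estimators(tau=0.2, sigma_u=1.0, sigma_e=0.5, sizes=(1, 8),
                        pupils_per_arm=50, n_sims=1000, seed=0):
    """Compare pooled OLS with random-intercept GLS across simulated samples.

    Assumed model: y_ij = tau * T_i + u_i + e_ij, with u_i ~ N(0, sigma_u^2)
    shared by all observations of pupil i and e_ij ~ N(0, sigma_e^2).
    Returns the simulated standard deviation of each estimator of tau.
    """
    rng = random.Random(seed)
    ols_draws, gls_draws = [], []
    for _ in range(n_sims):
        ols_mean, gls_mean = {}, {}
        for arm in (0, 1):
            total_y = 0.0
            total_n = 0
            weighted_sum = 0.0
            weight_total = 0.0
            for pupil in range(pupils_per_arm):
                n_i = sizes[pupil % len(sizes)]  # unbalanced panel: 1 or 8 observations
                u_i = rng.gauss(0.0, sigma_u)    # common shock at the pupil level
                ys = [tau * arm + u_i + rng.gauss(0.0, sigma_e) for _ in range(n_i)]
                total_y += sum(ys)
                total_n += n_i
                # GLS weight: inverse variance of the pupil mean under the assumed model.
                w_i = 1.0 / (sigma_u ** 2 + sigma_e ** 2 / n_i)
                weighted_sum += w_i * (sum(ys) / n_i)
                weight_total += w_i
            ols_mean[arm] = total_y / total_n            # every observation weighted equally
            gls_mean[arm] = weighted_sum / weight_total  # pupil means weighted by precision
        ols_draws.append(ols_mean[1] - ols_mean[0])
        gls_draws.append(gls_mean[1] - gls_mean[0])
    return statistics.stdev(ols_draws), statistics.stdev(gls_draws)


if __name__ == "__main__":
    sd_ols, sd_gls = simulate_estimators()
    print(f"SD of pooled OLS estimate: {sd_ols:.4f}")
    print(f"SD of GLS estimate:        {sd_gls:.4f}")
```

Pooled OLS lets pupils with many observations dominate, even though their repeated observations share the same shock. The GLS weights correct for this, and the gain depends on the correlation structure being approximately right. A full LME fit would also estimate the variance components from the data rather than take them as given.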

###### “…Endline data are often far from normal and correlation structures across units are hard to know ex ante. A blinded endline approach can be a useful substitute for tools like *DeclareDesign* in cases where baseline data, or a realistic basis for simulating the endline data-generating process, are not available.”

###### “There is broad consensus that well-powered studies are important, not least because they make null results more informative. Consequently, researchers invest a lot in statistical power. Our recent experience suggests that blinded analyses — whether based on pooled or partial endline data — can be a useful tool to make informed choices of models and test statistics that improve power.”

###### To read more, **click here**.
