What If There Isn’t a Single Effect Size? Implications for Power Calculations, Hypothesis Testing, Confidence Intervals and Replications
[From the working paper “The Unappreciated Heterogeneity of Effect Sizes: Implications for Power, Precision, Planning of Research, and Replication” by David Kenny and Charles Judd, posted at Open Science Framework (OSF)]
“The goal of this article is to examine the implications of effect size heterogeneity for power analysis, the precision of effect estimation, and the planning of both original and replication research.”
“…given effect heterogeneity, the power in testing an effect in any particular study is different from what conventional power analyses suggest, and the extent to which this is true depends on the magnitude of the heterogeneity. Whenever a conventional power analysis yields a power value less than .50, an estimate that allows for heterogeneity is greater; and when a conventional analysis yields a power value greater than .50, the estimate given heterogeneity is less.”
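To see why the crossover sits at .50, here is a minimal sketch using a normal approximation. It is not the authors' own calculation. The assumptions are mine: a two-group design with a standardized mean difference whose standard error is roughly sqrt(2/n), true effects that vary across studies as a normal distribution with mean `mean_d` (taken to be positive) and standard deviation `tau`, and "power" defined as the chance of a significant result in the direction of the average effect. The function name and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def power_same_direction(mean_d, n_per_group, tau=0.0, alpha=0.05):
    """Approximate chance of a significant result in the direction of mean_d.

    Assumes mean_d > 0, a two-sided z test, and normally distributed
    true effects with standard deviation tau (tau = 0 is the conventional case).
    """
    se = sqrt(2.0 / n_per_group)                       # approximate sampling SE of d
    crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0) * se  # smallest significant estimate
    total_sd = sqrt(se ** 2 + tau ** 2)                # sampling error plus heterogeneity
    return 1.0 - STD_NORMAL.cdf((crit - mean_d) / total_sd)


for n in (20, 200):
    conventional = power_same_direction(0.3, n)
    heterogeneous = power_same_direction(0.3, n, tau=0.2)
    print(f"n per group = {n}: conventional {conventional:.2f}, "
          f"with heterogeneity {heterogeneous:.2f}")
```

Under these assumptions heterogeneity only widens the distribution of estimates around the same mean. If the significance cutoff lies above the mean (conventional power below .50), a wider distribution puts more mass past it; if the cutoff lies below the mean, a wider distribution puts less mass past it.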
“…given some heterogeneity and a small to moderate average effect size, there is a non-trivial chance of finding a significant effect in the opposite direction from the average effect size reported in the literature. …This probability increases as N increases.”
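The wrong-direction result can be illustrated with the same kind of normal approximation. Again, this is a sketch under my own assumptions rather than the paper's method: a two-group design with standard error of about sqrt(2/n), normally distributed true effects with a positive mean `mean_d` and standard deviation `tau`, and a two-sided z test. The function name and the values used (`mean_d = 0.3`, `tau = 0.2`) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def prob_significant_opposite(mean_d, n_per_group, tau, alpha=0.05):
    """Approximate chance of a significant estimate with the opposite sign.

    Assumes mean_d > 0, so "opposite" means significantly negative.
    """
    se = sqrt(2.0 / n_per_group)                       # approximate sampling SE of d
    crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0) * se  # cutoff for significance
    total_sd = sqrt(se ** 2 + tau ** 2)                # sampling error plus heterogeneity
    return STD_NORMAL.cdf((-crit - mean_d) / total_sd)


for n in (50, 200, 1000):
    p = prob_significant_opposite(0.3, n, tau=0.2)
    print(f"n per group = {n}: P(significant, wrong direction) = {p:.3f}")
```

In this sketch, a larger N shrinks the significance cutoff toward zero while the spread of true effects stays put, so over the sample sizes tried here more of the negative tail counts as significant. With no heterogeneity (`tau = 0`) the same probability shrinks toward zero as N grows.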
“Many analysts recommend what might be called a one-basket strategy. They put all their eggs in the one basket of a very large N study. … such a strategy is misguided … given the same total N and heterogeneity, multiple studies are better than a single study.”
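One way to see the case against the one-basket strategy is to compare how precisely the average effect is estimated when a fixed total N is spent on one study versus split across several. The sketch below is my own simplification, not the authors' analysis: it assumes equal-sized two-group studies, a sampling variance of about 2/n per study, independent true effects with standard deviation `tau`, and a simple unweighted average of the study estimates. The function name and numbers are hypothetical.

```python
from math import sqrt


def se_of_average_effect(total_n_per_group, k_studies, tau):
    """Approximate SE of the average effect when total N is split over k studies."""
    n_each = total_n_per_group / k_studies    # per-group n in each study
    var_one_study = tau ** 2 + 2.0 / n_each   # heterogeneity plus sampling variance
    return sqrt(var_one_study / k_studies)    # variance of a mean of k estimates


for k in (1, 2, 4, 8):
    se = se_of_average_effect(400, k, tau=0.2)
    print(f"{k} studies, 400 per group in total: SE of average effect = {se:.3f}")
```

Under these assumptions the variance works out to tau²/k + 2/N: splitting the sample leaves the sampling part unchanged and divides the heterogeneity part by the number of studies. With `tau = 0` the split makes no difference, which is why the advantage of multiple studies only appears once heterogeneity is allowed for.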
“In the presence of heterogeneity, our results show that power is not nearly as high as it would seem and that even large N studies may have a non-trivial chance of finding a result in the opposite direction from the original study. This makes us question the wisdom of placing a great deal of faith in a single replication study. The presence of heterogeneity implies that there are a variety of true effects that could be produced.”