Power is not the point. It’s the curve.

Posted on 4th February 2019 by replicationnetwork


[From the blog “Why you shouldn’t say ‘this study is underpowered’” by Richard Morey, posted at Towards Data Science, at Medium.com]

“The first thing to clear up, as I’ve stated above, is that a study or an experiment is not underpowered; rather: A design and test combination can be underpowered for detecting hypothetical effect sizes of interest.”

“Suppose we worked for a candy company and had determined that our new candy would be either green or purple. We’ve been tasked with finding out whether people like green or purple candy better, so we construct an experiment where we give people both and see which one they reach for first. For each person, the answer is either “green” or “purple”. Let’s call θ the probability of picking purple first, so we’re interested in whether θ>.5 (that is, purple is preferred).”

“Suppose we fix our design at N=50 people picking candy colors. We now need a test. … “If 31 or more people pick purple, we’ll say that purple is preferred (i.e., θ>.5)”. We can now draw the power/sensitivity curve for the design and test, given all the potential, hypothetical effect sizes (shown in the figure to the left, as curve “A”).”
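The curve described in this passage can be computed directly from the binomial distribution. The sketch below is an illustration rather than code from the original post: it assumes each of the N people picks independently with the same probability θ of choosing purple, and the function name `power` is hypothetical.

```python
from math import comb


def power(theta, n=50, threshold=31):
    """Probability that at least `threshold` of `n` people pick purple
    when each picks purple independently with probability `theta`."""
    return sum(
        comb(n, k) * theta**k * (1 - theta) ** (n - k)
        for k in range(threshold, n + 1)
    )


# Trace the curve for the "31 or more" test over a grid of
# hypothetical effect sizes.
for theta in [0.3, 0.4, 0.5, 0.6, 0.7, 0.8]:
    print(f"theta={theta:.1f}  P(say purple is preferred)={power(theta):.3f}")
```

Evaluating the function on a finer grid of θ values and plotting the result would give a curve of the kind the quote calls “A”. Nothing here depends on observed data or on the true effect: the curve is a property of the design (N=50) and the test (the threshold of 31).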

“A “power analysis” is simply noting the features of this curve (perhaps along with changing the potential design by increasing N). Look at curve A. If green candies are preferred (θ<.5) we have a very low chance of mistakenly saying that purple candies are preferred (this is good!). If purple is substantially preferred (θ>.7), we have a good chance of correctly saying that purple is preferred (also good!).”

“Now let’s consider another test for this design: “If 26 or more people pick purple, we’ll say that purple is preferred (θ>.5)”. This could be motivated by saying that we’ll claim that purple is truly preferred whenever the data seem to “prefer” purple. This is curve “B” in the figure above. Let’s do a power analysis. If purple is substantially preferred (θ>.7), we are essentially sure to correctly say that purple is preferred (good!). If green candies are preferred (θ<.5), we could have a high chance (over 40%) of mistakenly saying that purple candies are preferred (this is bad!).”

“A design sensitivity analysis — what is often called a power analysis — is just making sure the sensitivity is low in the region where the “null” is true (in common lingo, “controlling” α), and making sure the power/sensitivity is high where we’d care about it. None of this has anything to do with “estimating” power from previous results, or anything to do with the actually true effect.”
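As a rough illustration of such a design sensitivity analysis, the sketch below compares the two tests from the quotes (thresholds of 31 and 26 with N=50) at two points on their curves. It is not code from the original post. It assumes independent picks with a common probability θ; the name `sensitivity` is hypothetical, and θ=0.7 is used as the effect size of interest only because the quoted example mentions it. Because the probability of exceeding a fixed threshold increases with θ, the value at θ=0.5 is the upper bound on the error rate over the whole region θ≤0.5.

```python
from math import comb


def sensitivity(theta, threshold, n=50):
    """P(at least `threshold` of `n` people pick purple | P(purple) = theta)."""
    return sum(
        comb(n, k) * theta**k * (1 - theta) ** (n - k)
        for k in range(threshold, n + 1)
    )


# Thresholds taken from the two tests described in the quotes.
tests = {"A (31 or more)": 31, "B (26 or more)": 26}

for name, threshold in tests.items():
    # Upper bound on the chance of wrongly claiming purple when theta <= 0.5.
    null_bound = sensitivity(0.5, threshold)
    # Chance of correctly claiming purple at an effect size we would care about.
    at_interest = sensitivity(0.7, threshold)
    print(f"test {name}: bound under null={null_bound:.3f}, "
          f"sensitivity at theta=0.7={at_interest:.3f}")
```

Both numbers come from the design and the test alone, which is the point of the passage: no previous results and no knowledge of the true θ enter the calculation.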

To read the blog, click here.

