Could Bayes Have Saved Us From the Replication Crisis?

[Excerpts are taken from the article “The Flawed Reasoning Behind the Replication Crisis” by Aubrey Clayton, published at]
“Suppose an otherwise healthy woman in her forties notices a suspicious lump in her breast and goes in for a mammogram. The report comes back that the lump is malignant.”
“She wants to know the chance of the diagnosis being wrong. Her doctor answers that, as diagnostic tools go, these scans are very accurate. Such a scan would find nearly 100 percent of true cancers and would only misidentify a benign lump as cancer about 5 percent of the time. Therefore, the probability of this being a false positive is very low, about 1 in 20.”
“…the missing ingredient …is the prior probability for the various hypotheses.”
“For the breast cancer example, the doctor would need to consider the overall incidence rate of cancer among similar women with similar symptoms, not including the result of the mammogram. Maybe a physician would say from experience that about 99 percent of the time a similar patient finds a lump it turns out to be benign. So the low prior chance of a malignant tumor would balance the low chance of getting a false positive scan result. Here we would weigh the numbers: (0.05) * (0.99)  vs. (1) * (0.01).”
“We’d find there was about an 83 percent chance the patient doesn’t have cancer.”
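[The weighing in the excerpt above can be checked with a few lines of Python. This is a minimal sketch of Bayes' rule using the article's illustrative numbers (99 percent prior chance of a benign lump, 5 percent false positive rate, no false negatives). The function and variable names are our own, not the article's.]

```python
def posterior_benign(prior_benign, false_positive_rate, sensitivity):
    """Probability the lump is benign, given a positive scan (Bayes' rule)."""
    # Weight for "benign, but the scan says cancer": (0.05) * (0.99)
    weight_benign = false_positive_rate * prior_benign
    # Weight for "malignant, and the scan says cancer": (1) * (0.01)
    weight_malignant = sensitivity * (1.0 - prior_benign)
    # Normalize so the two weights sum to 1
    return weight_benign / (weight_benign + weight_malignant)


# Numbers taken from the excerpt; they are illustrative, not clinical data.
p_benign = posterior_benign(prior_benign=0.99,
                            false_positive_rate=0.05,
                            sensitivity=1.0)
print(f"Chance the patient does not have cancer: {p_benign:.1%}")
```

[The two weights are 0.0495 and 0.01, so the benign hypothesis takes 0.0495 / 0.0595 of the total, which is the "about 83 percent" quoted above.]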
“The problem, though, is the dominant mode of statistical analysis these days isn’t Bayesian. Since the 1920s, the standard approach to judging scientific theories has been significance testing, made popular by the statistician Ronald Fisher.”
“To understand what’s wrong, consider the following completely true, Fisherian summary of the facts in the breast cancer example (no false negatives, 5 percent false positive rate):”
“Suppose we scan 1 million similar women, and we tell everyone who tests positive that they have cancer. Then, among those who actually have cancer, we will be correct every single time. And among those who don’t have it, we will only be incorrect 5 percent of the time. So, overall our procedure will be incorrect less than 5 percent of the time.”
“Sounds persuasive, right? But here’s another summary of the facts, including the base rate of 1 percent:”
“Suppose we scan 1 million similar women, and we tell everyone who tests positive that they have cancer. Then we will have correctly told all 10,000 women with cancer that they have it. Of the remaining 990,000 women whose lumps were benign, we will incorrectly tell 49,500 women that they have cancer. Therefore, of the women we identify as having cancer, about 83 percent will have been incorrectly diagnosed.”
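[Both summaries describe the same counts, which a short Python sketch makes explicit. It assumes the article's figures (1 million women, 1 percent base rate, 5 percent false positive rate, no false negatives). The variable names are ours.]

```python
# Illustrative figures from the excerpt, kept as integers to avoid rounding.
population = 1_000_000
base_rate_pct = 1        # percent of similar women whose lump is malignant
false_positive_pct = 5   # percent of benign lumps flagged as cancer

with_cancer = population * base_rate_pct // 100
benign = population - with_cancer

true_positives = with_cancer                        # no false negatives
false_positives = benign * false_positive_pct // 100
all_positives = true_positives + false_positives

# "Fisherian" summary: how often is the procedure wrong across everyone?
overall_error_rate = false_positives / population

# Base-rate summary: how often is a positive diagnosis wrong?
wrong_among_positives = false_positives / all_positives

print(f"Women with cancer:         {with_cancer:,}")
print(f"False positives:           {false_positives:,}")
print(f"Overall error rate:        {overall_error_rate:.2%}")
print(f"Wrong among the positives: {wrong_among_positives:.1%}")
```

[The procedure is wrong for only 4.95 percent of all women scanned, yet about 83 percent of the women told they have cancer do not have it. Both statements are true at once. They simply answer different questions.]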
“Suppose the women who received positive test results and a presumptive diagnosis of cancer in our example were tested again by having biopsies. We would see the majority of the initial results fail to repeat, a “crisis of replication” in cancer diagnoses. That’s exactly what’s happening in science today.”
“We Bayesians have seen this coming for years. … Now, a consensus is finally beginning to emerge: Something is wrong with science that’s causing established results to fail. One proposed and long overdue remedy has been an overhaul of the use of statistics.”
“In 2015, the journal Basic and Applied Social Psychology took the drastic measure of banning the use of significance testing in all its submissions, and this March, an editorial in Nature co-signed by more than 800 authors argued for abolishing the use of statistical significance altogether.”
“Similar proposals have been tried in the past, but every time the resistance has been beaten back and significance testing has remained the standard. Maybe this time the fear of having a career’s worth of results exposed as irreproducible will provide scientists with the extra motivation they need.”
To read the article, click here.
