To p-Value or Not to p-Value? An Answer From Signal Detection Theory
[From the article “Insights into Criteria for Statistical Significance from Signal Detection Analysis” by Jessica Witt, published in Meta-Psychology]
“… the best criteria for statistical significance are ones that maximize discriminability between real and null effects, not just those that minimize false alarms. One analytic technique that is intended to measure the discriminability of a test is signal detection theory…”
“Signal detection analysis involves categorizing outcomes into four categories. Applied to criteria for statistical significance, a hit occurs when there is a true effect and the analysis correctly identifies it as significant (see Table 1). A miss occurs when there is a true effect but the analysis identifies it as not significant. A correct rejection occurs when there is no effect and the analysis correctly identifies it as not significant, and a false alarm occurs when there is no effect but the analysis identifies it as significant.”
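To make the four categories concrete, here is a minimal sketch (not code from the article) mapping the two binary facts — whether a true effect exists, and whether the test calls it significant — onto the four outcomes described above:

```python
def classify(effect_is_real: bool, called_significant: bool) -> str:
    """Return the signal detection category for one study outcome."""
    if effect_is_real:
        # A real effect is either detected (hit) or not (miss).
        return "hit" if called_significant else "miss"
    # A null effect is either flagged (false alarm) or not (correct rejection).
    return "false alarm" if called_significant else "correct rejection"

print(classify(True, True))    # hit
print(classify(True, False))   # miss
print(classify(False, False))  # correct rejection
print(classify(False, True))   # false alarm
```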

“In statistics, Type I errors (false alarms) and Type II errors (misses) are sometimes considered separately, with Type I errors being a function of the alpha level and Type II errors being a function of power. An advantage of signal detection theory is that it combines Type I and Type II errors into a single analysis of discriminability…”
“…p values were effective, though not perfect, at discriminating between real and null effects.”
“Bayes factor incurs no advantage over p values at detecting a real effect versus a null effect … This is because Bayes factors are redundant with p values for a given sample size.”
“When power is high, researchers using p values to determine statistical significance should use a lower criterion.”
“… a change to be more conservative will decrease false alarm rates at the expense of increasing miss rates. False alarm rates should not be considered in isolation without also considering miss rates. Rather, researchers should consider the relative importance for each in deciding the criterion to adopt.”
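The tradeoff in that quote can be illustrated with a hypothetical calculation (not from the article): for a fixed study design, lowering alpha reduces false alarms but mechanically raises the miss rate. The sketch assumes a one-sided two-sample z-test with an assumed effect size of d = 0.5 and 30 participants per group — numbers chosen purely for illustration:

```python
from statistics import NormalDist

def miss_rate(alpha: float, effect_size: float, n_per_group: int) -> float:
    """Miss rate (Type II error) of a one-sided two-sample z-test."""
    nd = NormalDist()
    # Noncentrality of the test statistic under the true effect.
    noncentrality = effect_size * (n_per_group / 2) ** 0.5
    return nd.cdf(nd.inv_cdf(1 - alpha) - noncentrality)

# Making the criterion more conservative raises the miss rate:
for alpha in (0.05, 0.01, 0.005):
    print(f"alpha={alpha}: miss rate = {miss_rate(alpha, 0.5, 30):.2f}")
```

With these assumed numbers, tightening alpha from 0.05 to 0.005 roughly doubles the miss rate, which is the cost the quote warns against considering in isolation.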
“…given that true null results can be theoretically interesting and practically important, a conservative criterion can produce critically misleading interpretations by labeling real effects as if they were null effects.”
“Moving forward, the recommendation is to acknowledge the relationship between false alarms and misses, rather than implement standards based solely on false alarm rates.”