Reproducible Workflow: The Movie

This is a great little YouTube video on reproducibility.  1 minute and 44 seconds.  Money-back guarantee if you aren’t glad you checked it out.  To see it, click here.

IN THE NEWS: Washington Post (August 25, 2016)

The Washington Post has a story today about “Results-Free Reviewing” (RFR).  What is RFR? When journals review a manuscript without knowing what the results are.  Manuscripts are reviewed purely on whether the question is interesting and whether the experimental design seems appropriate for answering the question.  A recent TRN post discussed a pilot study at the journal Comparative Political Studies (click here).  To read the Washington Post article, click here.

Getting the Publishers Into the Act

At a recent Wiley Executive Seminar, participants from the academic research and publishing community discussed how open science can reduce research bias.  Two trends are particularly noteworthy.  The first is that the TOP guidelines (Transparency and Openness Promotion) are gaining prominence.  According to the article, the guidelines have been endorsed by 62 organizations, including publishers like Wiley, along with 714 journals.  The second trend is “Registered Reports,” where journals agree to publish research based solely on research design, before the results are known.  According to the article, 20 journals have signed up to provide this option.  To read more, click here.

BYINGTON & FELPS: On Resolving the Social Dilemmas that Lead to Non-Credible Science

In our forthcoming article “Solutions to the credibility crisis in Management science” (full text available here), we suggest that “social dilemmas” in the production of Management science put scholars and journal gatekeepers in a difficult position – pitting self-interest against the production of credible scientific claims. We argue that recognizing that the credibility crisis in Management science is at least partly a consequence of social dilemmas – and treating it as such – are foundational steps that can help move the field toward adopting the variety of credibility enhancing practices that scientists have been advocating for decades (e.g. Ceci & Walker, 1983; N. L. Kerr, 1998).
Although we are Management scholars rather than economists, we suspect that the social dilemma dynamics we point out (and the solutions we propose) are relevant for improving the credibility of claims produced by many fields (e.g., economics, sociology, anthropology, psychology, criminology, education, political science, medicine, etc.). As such, we are grateful for the invitation from The Replication Network to share a summary of our article for your consideration.
ARTICLE SUMMARY:
Credibility Problems in Management Science
The claims of primary studies in Management cannot be fully relied upon, as evidenced by the fact that a) results fail to replicate much more often than they should, and b) attempts to verify and replicate prior claims rarely appear in the literature (Hubbard & Vetter, 1996).
There is reason to believe that the weak replicability of Management findings may be the result of four sets of troublingly prevalent researcher behaviors (see full manuscript for evidence of prevalence):
— Unacknowledged “Hypothesizing After the Results are Known” (N. L. Kerr, 1998);
— Data manipulation (also known as p-hacking), which involves exploiting researchers’ “degrees of freedom” – e.g. adding / dropping control variables, dropping uncooperative data points / conditions, using alternative measures / transformations – to find desired results (Goldfarb & King, 2016);
— Data fraud, which involves changing data points or generating data wholesale (John, Loewenstein, & Prelec, 2012);
— Data hoarding, which involves an unwillingness to share data or research materials that would allow others to verify whether one’s data is consistent with one’s published conclusions (Wicherts, Bakker, & Molenaar, 2011).
As demonstrated in studies such as that of Simmons, Nelson, and Simonsohn (2011), such practices can dramatically increase the likelihood of producing “statistically significant” (but ultimately erroneous) findings.
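To see why, consider a minimal simulation sketch (ours, not from Byington & Felps or Simmons et al.; the sample sizes and variable names below are purely illustrative assumptions). With no true effect at all, a researcher who tests a single pre-specified outcome obtains a false positive about 5% of the time; a researcher who reports whichever of two outcome measures happens to come out significant roughly doubles that rate.

```python
# Illustrative sketch: how one researcher "degree of freedom" -- choosing
# which of two outcome measures to report after seeing the data -- inflates
# the false-positive rate when there is no true effect at all.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies, n_per_group = 10_000, 20
prespecified_hits = 0  # significant results when testing only outcome 1
flexible_hits = 0      # significant results when reporting the better of outcomes 1 and 2

for _ in range(n_studies):
    # Two groups, two outcome measures, and no true treatment effect on either.
    control = rng.normal(size=(n_per_group, 2))
    treatment = rng.normal(size=(n_per_group, 2))
    p1 = stats.ttest_ind(treatment[:, 0], control[:, 0]).pvalue
    p2 = stats.ttest_ind(treatment[:, 1], control[:, 1]).pvalue
    prespecified_hits += p1 < 0.05
    flexible_hits += min(p1, p2) < 0.05

print(f"Pre-specified outcome: {prespecified_hits / n_studies:.1%} false positives")
print(f"Best of two outcomes:  {flexible_hits / n_studies:.1%} false positives")
```

Simmons, Nelson, and Simonsohn (2011) show that stacking several such flexible choices (optional stopping, covariate selection, dropping conditions) pushes the false-positive rate far higher still.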
Drivers of Non-Credible Research Practices
We argue that the reason these undesirable research behaviors are so prevalent in Management is that engaging in such behaviors can be beneficial for one’s career, since such behaviors facilitate the production of highly citable (i.e. novel, theory adding, statistically significant) research claims likely to be publishable in high status journals. Of course, engaging in these research behaviors comes with some risk of detection, but the current lack of verification / replication efforts would seem to make the chance of detection low. Thus, scholars are in a social dilemma, where what is good for them individually (i.e. producing highly citable claims) is at odds with what is good for society/science as a whole (producing credible, replicable claims).
There are a variety of journal practices that would significantly decrease the career benefits associated with non-credible research practices, and thus lead to more credible Management science.  They include:
— Frequent publication of high-quality strict replications via dedicated journal space, distinct reviewing criteria for replication studies, provision of replication protocols, and crowd-sourcing replication efforts;
— Enabling robustness checks through in-house analysis checks and altered data submission policies;
— Enabling the publication of null results through registered reports and results-blind review;
— Adopting Open Practice article badges (Center for Open Science, 2015).
However, adoption of these practices has been slow. We propose that one possible reason is that a journal that “sticks its neck out” and adopts these credibility supportive practices is likely to see its status decline.  For example, null findings and replications are rarely cited (Hubbard, 2015), and thus publishing them can reduce a journal’s impact factor. Similarly, requiring scholars to submit their data when competitor journals do not have such a requirement will make the “purist journal” a less attractive publication outlet for scholars, potentially reducing its pool of highly citable submissions. Indeed, each of these credibility enhancing journal practices is likely to lead to research that is both more reliable and less citable. This means that journal gatekeepers (editors and reviewers) are themselves trapped in a social dilemma, where what is good for the journal’s status (i.e. high impact factor relative to “competitor journals”) is at odds with what is good for society/science as a whole (i.e. adopting credibility enhancing practices that help ensure more reliable claims).
Resolving the Social Dilemmas
Fortunately, social science has accumulated a great deal of knowledge about how to resolve social dilemmas (Kollock, 1998; Van Lange, Balliet, Parks, & Vugt, 2013). Specifically, we suggest three structural social dilemma solutions and two motivational social dilemma solutions.
Structural social dilemma solutions involve changing the incentives for journal gatekeepers (Messick & Brewer, 1983). We suggest the following structural social dilemma interventions:
— Define small peer journal groups: A prerequisite for conditional pledges (below) and other social dilemma solutions is identifying a population of peer (i.e. “competitor”) journals.
— Conditional pledges by editors: These are public pledges to adopt certain credibility supportive journal practices if a substantial portion of peer journals also agree to the pledge. This approach is meant to mitigate “relative status costs” of a journal adopting credibility supportive practices.
— Reviewer pledges: Credibility-minded reviewers could themselves publicly pledge to (only) review for journals that adopt credibility supportive journal practices, creating an incentive for editors to sign onto a conditional pledge with their peer journals.
Motivational social dilemma solutions increase the desire to generously cooperate with others without changing the underlying incentives (Messick & Brewer, 1983). We suggest the following motivational social dilemma interventions:
— Increase multi-journal communication: Editors are more likely to cooperate with other journals in adopting credibility supportive journal practices if they discuss the field-level benefits of doing so face-to-face with their peers (i.e., other editors).
— Inject a moral frame: Journal editors are more likely to adopt credibility supportive journal practices when such practices are framed as a moral imperative.
Across many fields, there is a growing appetite for improving the way science is done. The social dilemma solutions presented in the article build on the belief that the best hope for resolving the credibility crisis in science is in pragmatic (re)consideration of scholars’ and journal gatekeepers’ incentives for producing credible scientific claims. Until then, we are merely rewarding A while hoping for B (S. Kerr, 1975).
REFERENCES
Byington, E. K., & Felps, W. (forthcoming). Solutions to the credibility crisis in management science. Academy of Management Learning & Education.
Ceci, S. J., & Walker, E. (1983). Private archives and public needs. American Psychologist, 38(4), 414–423. http://doi.org/10.1037/0003-066X.38.4.414
Center for Open Science. (2015, January 24). Badges to acknowledge open practices. Retrieved January 26, 2015, from https://osf.io/tvyxz/wiki/1.%20View%20the%20Badges/
Goldfarb, B. D., & King, A. A. (2016). Scientific apophenia in strategic management research. Strategic Management Journal, 37(1), 167–176.
Hubbard, R. (2015). Corrupt research: The case for reconceptualizing empirical management and social science. Newcastle upon Tyne, UK: Sage.
Hubbard, R., & Vetter, D. E. (1996). An empirical comparison of published replication research in accounting, economics, finance, management, and marketing. Journal of Business Research, 35(2), 153–164. http://doi.org/10.1016/0148-2963(95)00084-4
John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524–532. http://doi.org/10.1177/0956797611430953
Kepes, S., Banks, G. C., McDaniel, M., & Whetzel, D. L. (2012). Publication bias in the organizational sciences. Organizational Research Methods, 15(4), 624–662. http://doi.org/10.1177/1094428112452760
Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2(3), 196–217.
Kerr, S. (1975). On the folly of rewarding A, while hoping for B. Academy of Management Journal, 18(4), 769–783. http://doi.org/10.2307/255378
Kollock, P. (1998). Social dilemmas: The anatomy of cooperation. Annual Review of Sociology, 183–214.
Messick, D. M., & Brewer, M. B. (1983). Solving social dilemmas: A review. Review of Personality and Social Psychology, 4(1), 11–44.
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366.
Van Lange, P. A. M., Balliet, D. P., Parks, C. D., & Vugt, M. van. (2013). Social dilemmas: Understanding human cooperation. Oxford, UK: Oxford University Press.
Wicherts, J. M., Bakker, M., & Molenaar, D. (2011). Willingness to share research data is related to the strength of the evidence and the quality of reporting of statistical results. PLOS ONE, 6(11), e26828. http://doi.org/10.1371/journal.pone.0026828

BITSS Announces Call for Papers, Nominations for Leamer-Rosenthal Prize

The annual meeting of the Berkeley Initiative for Transparency in the Social Sciences (BITSS) will be held December 15-16 at Berkeley.  BITSS is currently calling for papers.  The deadline to submit is October 7th.  But wait!  There’s more!  BITSS is also announcing a pre-registration initiative for researchers studying the US elections, and is taking nominations for the Leamer-Rosenthal Prize (including self-nominations).  To learn more about these opportunities, click here.

Confused by “Replicability” Versus “Reproducibility”? You’re Not Alone

“Replication” means a lot of things to a lot of people.  But not necessarily the same thing.  Earlier this year, the National Academy of Sciences published a report on “Reproducibility” that fleshed out some of the subtleties of this concept.  TRN reported on that here.  However, this post is about something that appeared in October of 2015.  The website Language Log published a blog post on “Replicability vs. reproducibility — or is it the other way around?” that explains, at least in part, why there is some confusion about what these terms mean.  To read more, click here.

The Journal Comparative Political Studies Tries “Results-Free” Submissions

The website Retraction Watch has an interesting interview with MICHAEL FINDLEY about an experiment undertaken at Comparative Political Studies last year. The journal sponsored a special issue for which they solicited submissions where results were not reported.  Submissions were of two types: (1) planned research in which the results were not yet realized, and (2) completed research in which the results were removed from the paper.  The purpose of the experiment was to see if they could learn anything about publication bias.  The “sample size” was small: nineteen submissions, with only three papers accepted under these conditions, two of which had “null results” to some extent.  The interview can be read here.  And a link to the paper discussing the lessons learned from the experiment is here.

Is Most Published Research Wrong? The YouTube Video

This YouTube video, from the channel Veritasium, is a compendium of studies, anecdotes, and initiatives addressing key problems in scientific research.  It includes a compact summary of John Ioannidis’ famous paper, “Why Most Published Research Findings Are False”, studies that were published and shouldn’t have been (such as a recent study showing that eating chocolate causes weight loss), the story of the pentaquark — the sub-atomic particle that wasn’t, the research exercise on whether dark-skinned soccer players get red-carded more frequently, a nice discussion of p-hacking, and Brian Nosek’s famous aphorism: “There is no cost to getting things wrong.  The cost is not getting them published.”  All compressed into 12 minutes.  Check it out here.

Everything is F**KED: The Syllabus

Come on, admit it.  This is the course you really want to teach.  The weekly topics in Professor Sanjay Srivastava’s PSY607 include:
— Significance testing is f**ked
— Causal inference from experiments is f**ked
— Replicability is f**ked
— Scientific publishing is f**ked
— Meta-analysis is f**ked
— The scientific profession is f**ked
 The reading list makes — no joke here — compelling reading.  And should hold you over until “Everything is F**KED: The Movie” comes out.  Starring Brad Pitt, of course.  To get the full syllabus, click here.

Podcast on “The Replication Crisis”

On August 6th, at a conference held at Berkeley (Effective Altruism Global 2016), four panellists discussed “The Replication Crisis”: Brian Nosek, Stuart Buck, Ivan Oransky, and Stephanie Wykstra, with Julia Galef moderating.  Some of the questions addressed were:
— Is failure to replicate a growing problem? Or just a problem that is now being discovered?
— While psychology has received the most attention, is lack of replicability a problem in other disciplines?
— How much of the replication crisis is due to the behavior of researchers (e.g. p-hacking, well-meaning but subjective choices about how to statistically address a question), versus how much is due to the practices of journals (e.g., only publishing significant results)?
— How can the incentives in academic disciplines be changed to encourage reproducibility?
The podcast is about an hour long and gives insight into the thinking of some key figures in the reproducibility movement.  To watch the podcast, click here.