[From the article “Randomly auditing research labs could be an affordable way to improve research quality: A simulation study” by Adrian Barnett, Pauline Zardo, and Nicholas Graves, published at PLoS One]
“The “publish or perish” incentive drives many researchers to increase the quantity of their papers at the cost of quality. Lowering quality increases the number of false positive errors which is a key cause of the reproducibility crisis. We adapted a previously published simulation of the research world where labs that produce many papers are more likely to have “child” labs that inherit their characteristics. This selection creates a competitive spiral that favours quantity over quality. To try to halt the competitive spiral we added random audits that could detect and remove labs with a high proportion of false positives, and also improved the behaviour of “child” and “parent” labs who increased their effort and so lowered their probability of making a false positive error. Without auditing, only 0.2% of simulations did not experience the competitive spiral, defined by a convergence to the highest possible false positive probability. Auditing 1.35% of papers avoided the competitive spiral in 71% of simulations, and auditing 1.94% of papers in 95% of simulations. … Audits improved the literature by reducing the number of false positives from 30.2 per 100 papers to 12.3 per 100 papers.”
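The selection dynamic at the heart of the paper is easy to sketch in code. Below is a minimal toy version in Python, not the authors' actual model: their simulation, audit rule, and parameters are more elaborate, and every number here (population size, effort range, the 0.3 false-positive threshold) is our simplifying assumption. The sketch just shows the basic loop: prolific labs spawn "child" labs that inherit their effort level, and random audits cull labs with too many false positives.

```python
# Toy sketch of the audit idea (our illustration, not the authors' model):
# low-effort labs publish more but err more; prolific labs spawn children
# inheriting their effort; audits remove labs with many false positives.
import random

random.seed(1)

class Lab:
    def __init__(self, effort):
        self.effort = effort            # higher effort -> fewer false positives
        self.papers = 0
        self.false_positives = 0

    def publish(self):
        # Lower-effort labs get more papers out per cycle.
        n_papers = 1 if random.random() < self.effort else 2
        for _ in range(n_papers):
            self.papers += 1
            if random.random() > self.effort:   # error prob. = 1 - effort
                self.false_positives += 1

def simulate(generations=50, audit_prob=0.0, n_labs=100):
    labs = [Lab(random.uniform(0.5, 0.9)) for _ in range(n_labs)]
    for _ in range(generations):
        for lab in labs:
            lab.publish()
        # Random audits remove labs with a high share of false positives
        # (the 0.3 threshold is an arbitrary assumption for illustration).
        labs = [lab for lab in labs
                if not (random.random() < audit_prob
                        and lab.false_positives / lab.papers > 0.3)]
        # The most prolific labs spawn children inheriting their effort,
        # replacing the least prolific ones: selection on quantity.
        labs.sort(key=lambda lab: lab.papers, reverse=True)
        children = [Lab(min(max(p.effort + random.gauss(0, 0.02), 0.05), 0.95))
                    for p in labs[:5]]
        labs = labs[:n_labs - len(children)] + children
    total = sum(lab.papers for lab in labs)
    return 100 * sum(lab.false_positives for lab in labs) / total

print(f"FP per 100 papers, no audits:   {simulate(audit_prob=0.00):.1f}")
print(f"FP per 100 papers, with audits: {simulate(audit_prob=0.05):.1f}")
```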
[From the “2018 Economics Replication Project” posted by Nick Huntington-Klein and Andy Gill of California State University, Fullerton]
“In this project, we are asking recruited researchers to perform a “blind” replication of one of two studies. Without telling researchers the methods used by the original study, we will instruct participants to use a particular data set and set of statistical assumptions in order to estimate a single specific causal estimate. Participants will clean the data, construct variables, and make the other minor decisions that go into a statistical analysis, aside from the data source, identifying assumption, and effect of interest, which will be held constant. By comparing the analyses that different researchers perform under these conditions, we will estimate the variability in estimates that occurs as a result of decisions that researchers make.”
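To make the target quantity concrete: each participating researcher returns one estimate of the same causal effect, and the project measures the spread across those estimates. A toy illustration with entirely hypothetical numbers (not the project's data):

```python
# Toy illustration of the project's target quantity: the variability in
# causal estimates across analysts who were given the same data and the
# same identifying assumption. All numbers below are hypothetical.
import statistics

analyst_estimates = [0.21, 0.35, 0.18, 0.27, 0.31, 0.24, 0.40, 0.22, 0.29, 0.26]

print(f"mean estimate:     {statistics.mean(analyst_estimates):.3f}")
print(f"across-analyst SD: {statistics.stdev(analyst_estimates):.3f}")
print(f"range:             {min(analyst_estimates):.2f} "
      f"to {max(analyst_estimates):.2f}")
```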
“This approach is different from most replication studies in economics. We are not trying to test the validity of the original results. Instead, our aim is to measure the degree of variation in results that can be attributed to generally “invisible” features of analysis. You may have seen similar tests elsewhere, such as in the New York Times’ The Upshot section. Our project is most similar to the “Crowdsourced Data Analysis” project described by Raphael Silberzahn and Eric Uhlmann here, although our goal is slightly different.”
“If you are interested in joining us, we are looking for researchers who have at least one published or forthcoming paper in the empirical microeconomics literature and who are familiar with methods of causal identification. Participants will be offered authorship on the final publication. We are also currently working on securing funding. If we do, there may be financial compensation for your time.”
The Journal of Development Economics (JDE) is piloting a new approach in which authors have the opportunity to submit empirical research designs for review and approval before the results of the study are known. While the JDE is the first journal in economics to implement this approach—referred to as “pre-results review”—it joins over 100 other journals from across the sciences.
What is Pre-Results Review?
Pre-results review splits the peer review process into two stages. In Stage 1, authors submit a plan for a prospective research project, typically including a literature review, research question(s), hypotheses, and a detailed methodological framework. This submission is evaluated based on the significance of the research question(s), the soundness of the theoretical reasoning, and the credibility and feasibility of the research design.
Positively evaluated submissions are accepted based on pre-results review. This constitutes a commitment by the journal to publish the full paper, regardless of the nature of the empirical results. Authors then collect and analyze their data and submit the completed paper for review and publication (Stage 2). The Stage 2 review provides quality assurance and ensures alignment with the research design peer-reviewed in Stage 1.
Why Pre-Results Review?
In development economics, we have long argued for the use of rigorous evidence to inform decisions about public policies. However, incentives in academia and journal publishing often reward studies featuring novel, theoretically tidy, or statistically significant results. Papers that fail to report such findings often go unpublished, even if the studies are of high quality and address important questions. As a result, we are left with an evidence base composed of papers that tell ‘neat’ and clean stories, but may not accurately represent the world. When such research serves as the foundation for public policies, this publication bias can be costly.
In recent years, pre-results review has emerged as a potential alternative model to address publication bias. We hope that this pilot will help us understand the effectiveness of this approach and its sustainability for both the JDE and other social science journals.
What’s in It For You?
– Publication decision earlier in the peer review process;
– Constructive feedback from peer reviewers earlier in the publishing process, with the potential for helpful suggestions for research design before beginning data collection;
– Editorial decisions that are not influenced by the results of a study;
– Inclusion of JDE “acceptance based on pre-results review” on authors’ CVs; and
– The chance to be part of an exciting pilot effort in economics!
How to Submit
Submissions should be filed as ‘Registered Reports’ on the JDE’s regular submissions portal.
The Berkeley Initiative for Transparency in the Social Sciences (BITSS) supports authors with pre-registering their research designs and preparing JDE submissions. Please contact Aleks Bogdanoski at firstname.lastname@example.org with any questions.
Established by the Center for Effective Global Action (CEGA) in 2012, the Berkeley Initiative for Transparency in the Social Sciences (BITSS) works to strengthen the integrity of social science research and evidence used for policy-making. The initiative aims to enhance the practices of economists, psychologists, political scientists, and other social scientists in ways that promote research transparency, reproducibility, and openness. Visit www.bitss.org and @UCBITSS on Twitter to learn more, find useful tools and resources, and contribute to the discussion.
[From the article, “One team’s struggle to publish a replication attempt, part 3” by Mante Nieuwland, published at Retraction Watch]
“The purpose of this post was to provide a transparent, behind-the-scenes account of our replication study and what happened when we submitted our study to Nature Neuroscience. On the one hand, I can understand why Nature journals might be hesitant to publish replication studies. It might open the floodgates to a wave of submissions that challenge conclusions from publications in their journal (although that in itself is not necessarily a bad thing).”
“On the other hand, a few things from this case study stand out by clearly contradicting Nature’s commitment to replication and transparency. Nature Neuroscience triaged our study for lack of general interest, failed to follow their own submission procedure in terms of timeline, failed to follow their own policy on data and materials sharing, failed to correct important omissions in the academic record of the original study, and failed to provide, in my opinion, a fair review process (i.e. by relying on one reviewer who faulted us for the lack of clarity due to the original paper, and on one non-expert reviewer who mostly just questioned our intentions and disagreed with the publication format).”
To read the full account, starting from the beginning, click here.
For economics journals that explicitly state they publish replications, click here.
To see a list of replication studies that have actually been published by economics journals, click here.
[This blog is a repost from the article “Publishers cannot afford to be coy about ethical breaches” published April 19th, 2018 in the Times Higher Education by Adam Cox, Russell Craig, and Dennis Tourish.]
There are rising concerns about the reliability of academic research, yet even when papers are retracted, the reasons are often left unexplained.
We recently studied 734 peer-reviewed journals in economics and identified 55 papers retracted for reasons other than “accidental duplication” or “administrative error”. Of those, 28 gave no clear indication of whether any questionable research practice was involved. It appears likely that it was: the reasons given for retraction in the other 27 papers include fake peer review, plagiarism, flawed reasoning, and multiple submission.
For 23 of the 28 “no reason” retractions, it is not even clear who instigated them: the editor alone, the author alone, or both in concert.
This reticence means that other papers by the same authors may not be investigated – as they should be – and are left in circulation. The feelings of authors may be spared, but the disincentives for them and others to engage in malpractice are reduced.
Many publishers refer approvingly to the guidelines of the Committee on Publication Ethics and the International Committee of Medical Journal Editors, which require the disclosure of a clear reason for retraction. However, we found that publishers’ policy statements on retraction are often ambiguous and unclear about what action they will take in response to serious research-related offences.
Perhaps the publishers are reluctant to embarrass themselves. Or perhaps they are intimidated by the possibility of legal action. But apart from their ethical obligations, they should recognise that the growing awareness of malpractice is diminishing public confidence in research integrity.
Publishers will claim that they safeguard research quality by providing a level of editorial scrutiny that keeps poor scholarship out of journals. If that claim is diluted, so is much of their unique selling point. However, if publishers take robust action against malpractice, they will have a stronger claim that they add value to the publishing process when it comes to safeguarding standards.
In our view, the publisher of a journal retracting a paper for research malpractice should be obliged to alert other journals that have published papers by the same authors. In egregious cases, such as those involving data fabrication, those journals’ editors should be required to audit the papers.
Relatedly, publishers should require submitting authors to make their data available in a way that facilitates inspection, re-analysis and replication. This would act as a bulwark against data fraud and poor statistical analysis. Such a requirement is reasonably widespread in the physical and life sciences, but it still tends to be confined to the top echelon of journals in economics. This may help explain why we found no articles retracted because of data fabrication.
Greater diligence is warranted. The Research Papers in Economics Plagiarism Committee is an international group of academic volunteers, mostly economists, who look into possible cases of plagiarism. They are well known in the economics community and have, to date, identified seven papers as involving malpractice. As we write, none of these have been retracted or corrected.
Nor is the social science community particularly diligent at watermarking those papers that are retracted. An article retracted by the American Economic Review in 2007, for instance, is still not identified as retracted anywhere in the document. Failure to mark flawed papers runs the risk that defective work might continue to be cited and influence scholarly thinking.
Journals must be more proactive. Failure to take serious actions against malpractice in scholarly publications is harming the integrity of research. Publishers and editors are critical gatekeepers. They cannot go on demanding full transparency from authors while being so non-transparent themselves.
Adam Cox is a senior lecturer in economics and finance and Russell Craig is professor of accounting and financial management, both at the University of Portsmouth. Dennis Tourish is professor of leadership and organisation studies at the University of Sussex. This is an abridged version of their paper, “Retraction statements and research malpractice in economics”, published in Research Policy.
[From the article, “Why all randomised controlled trials produce biased results”, by Alexander Krauss, recently published in Annals of Medicine]
“Randomised controlled trials (RCTs) are commonly viewed as the best research method to inform public health and social policy. Usually they are thought of as providing the most rigorous evidence of a treatment’s effectiveness without strong assumptions, biases and limitations.”
“This is the first study to examine that hypothesis by assessing the 10 most cited RCT studies worldwide.”
“…This study shows that these world-leading RCTs that have influenced policy produce biased results by illustrating that participants’ background traits that affect outcomes are often poorly distributed between trial groups, that the trials often neglect alternative factors contributing to their main reported outcome and, among many other issues, that the trials are often only partially blinded or unblinded. The study here also identifies a number of novel and important assumptions, biases and limitations not yet thoroughly discussed in existing studies that arise when designing, implementing and analysing trials.”
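The first of those claims, that randomisation can leave background traits poorly distributed between trial arms, is easy to demonstrate in miniature. The simulation below is our own illustration with simulated data, not part of Krauss’s analysis; it randomises participants and measures the chance gap in mean age between arms at different sample sizes:

```python
# A small simulation of one problem the article highlights: even under
# proper randomisation, baseline traits can end up poorly balanced
# between trial arms, especially in smaller trials. All data simulated.
import random

random.seed(0)

def imbalance(n):
    """Absolute difference in mean age between randomised arms."""
    ages = [random.gauss(50, 12) for _ in range(n)]
    arms = [random.random() < 0.5 for _ in range(n)]
    treat = [a for a, t in zip(ages, arms) if t]
    ctrl = [a for a, t in zip(ages, arms) if not t]
    if not treat or not ctrl:
        return 0.0
    return abs(sum(treat) / len(treat) - sum(ctrl) / len(ctrl))

for n in (20, 100, 1000):
    diffs = [imbalance(n) for _ in range(2000)]
    print(f"n={n:4d}: mean |age gap| between arms = "
          f"{sum(diffs) / len(diffs):.2f} years")
```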
[From a guest blog entitled “What To Do About The Reproducibility Crisis” by Letisha Wyatt, posted at jove.com]
“Late in my biomedical science graduate training, I learned that researchers often overlook the fundamental processes of adequately preparing data and all associated components (e.g., methods, analysis procedures) for access and preservation. This applies both inside and outside of their workgroups. In my opinion, this fundamental step of preservation has a big role in scientific reproducibility.”
“Ideally, data is FAIR-TLC: findable, accessible, interoperable, reusable, traceable, licensed, and connected. Scientists can learn much (as well as receive support) from librarians on how to best plan and execute FAIR-TLC principles, to better serve reproducible research.”
“Additionally, librarians can also be champions of change through less direct avenues, such as:”
– “Building awareness about the reproducibility crisis”
– “Finding and highlighting useful resources on reproducibility”
– “Defining policies about institutional data management”
– “Advocating for trainee information and data literacy needs”
“What is a successful replication? My students and I wanted a clear guide with examples, but couldn’t find a clear, straightforward article. Here’s a suggested table. Please help: what’s wrong/inaccurate or missing? Any other simple criteria to add?”
[This post is based on the report, “The Irreproducibility Crisis of Modern Science: Causes, Consequences and the Road to Reform”, recently published by the National Association of Scholars]
For more than a decade, and especially since the publication of a famous 2005 article by John Ioannidis, scientists in various fields have been concerned with the problems posed by the replication crisis. The importance of the crisis demands that it be understood by a larger audience of educators, policymakers, and ordinary citizens. To this end, our new report, The Irreproducibility Crisis of Modern Science, outlines the nature, causes, and significance of the crisis, and offers a series of proposals for confronting it.
At its most basic level, the crisis arises from the widespread use of statistical methods that inevitably produce some false positives. Misuse of these methods easily increases the number of false positives, leading to the publication of many spurious findings of statistical significance. “P-hacking” (running repeated statistical tests until a finding of significance emerges) is probably the most common abuse of statistical methods, but inadequate specification of hypotheses and the tendentious construction of datasets are also serious problems. (Gelman and Loken 2014 provide several good examples of how easily these latter faults can vitiate research findings.)
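The arithmetic behind p-hacking is simple enough to verify directly. Under a true null hypothesis a p-value is uniformly distributed, so running k independent tests and keeping whichever comes out “significant” inflates the false-positive rate from 5% to 1 − 0.95^k. A minimal simulation (our illustration, in Python):

```python
# Why p-hacking inflates false positives: under a true null, p-values
# are uniform on (0, 1), so a single test at alpha = 0.05 is significant
# 5% of the time, but the chance that at least one of k tests comes up
# significant is 1 - 0.95**k.
import random

random.seed(42)
ALPHA = 0.05
SIMS = 20_000

for k in (1, 5, 10, 20):
    hits = sum(
        any(random.random() < ALPHA for _ in range(k))  # k null tests
        for _ in range(SIMS)
    )
    print(f"{k:2d} tests: empirical false-positive rate = {hits / SIMS:.3f} "
          f"(theoretical: {1 - 0.95 ** k:.3f})")
```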
Methodological errors and abuses are enabled by too much researcher freedom and too little openness about data and procedures. Researchers’ unlimited freedom in specifying their research designs—and especially their freedom to change their research plans in mid-course—makes it possible to conjure statistical significance even for obviously nonsensical hypotheses (Simmons, Nelson, and Simonsohn 2011 provide a classic demonstration of this). At the same time, lack of outside access to researchers’ data and procedures prevents other experts from identifying problems in experimental design.
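One concrete version of that freedom, demonstrated by Simmons, Nelson, and Simonsohn, is optional stopping: peek at the data after every batch of observations and stop as soon as p < 0.05. The sketch below, our own toy z-test with a true null effect rather than the authors’ original study design, shows how far this tactic alone inflates the false-positive rate:

```python
# Optional stopping: test after every batch and stop at the first
# p < 0.05. With a true null effect, this roughly doubles the nominal
# 5% false-positive rate. (Toy z-test; our illustration.)
import math
import random

random.seed(7)

def pvalue(xs, ys):
    """Two-sided z-test for equal means, assuming unit variance."""
    nx, ny = len(xs), len(ys)
    z = (sum(xs) / nx - sum(ys) / ny) / math.sqrt(1 / nx + 1 / ny)
    return math.erfc(abs(z) / math.sqrt(2))

def optional_stopping(max_n=100, batch=10):
    xs, ys = [], []
    while len(xs) < max_n:
        xs += [random.gauss(0, 1) for _ in range(batch)]
        ys += [random.gauss(0, 1) for _ in range(batch)]  # same distribution!
        if pvalue(xs, ys) < 0.05:
            return True   # declared "significant" despite no real effect
    return False

TRIALS = 5000
false_pos = sum(optional_stopping() for _ in range(TRIALS))
print(f"false-positive rate with optional stopping: {false_pos / TRIALS:.3f}")
print("nominal rate for a single fixed-sample test: 0.050")
```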
Other factors in the irreproducibility crisis exist at the institutional level. Academia and the media create powerful incentives for researchers to advance their careers by publishing new and exciting positive results, while inevitable professional and political tendencies toward groupthink prevent challenges to an existing consensus.
The consequences of all these problems are serious. Not only is a lot of money being wasted—in the United States, up to $28 billion annually on irreproducible preclinical research alone (Freedman et al. 2015)—but individuals and policymakers end up making bad decisions on the basis of faulty science. Perhaps the worst casualty is public confidence in science, as people awaken to how many of the findings they hear about in the news can’t actually be trusted.
Fixing the replication crisis will require energetic efforts to address its causes at every level. Many scientists have already taken up the challenge, and institutions like the Center for Open Science and the Meta-Research Innovation Center at Stanford (METRICS), both in the U.S., have been established to improve the reproducibility of research. Some academic journals have changed the ways in which they ask researchers to present their results, and other journals, such as the International Journal for Re-Views in Empirical Economics, have been created specifically to push back against publication bias by publishing negative results and replication studies. National and international organizations, including the World Health Organization, have begun delineating more stringent research standards.
But much more remains to be done. In an effort to spark an urgently needed public conversation on how to solve the reproducibility crisis, our report offers a series of forty recommendations. At the level of statistics, researchers should cease to regard p-values as dispositive measures of evidence for or against a particular hypothesis, and should try to present their data in ways that avoid a simple either/or determination of statistical significance. Researchers should also pre-register their research procedures and make their methods and data publicly available upon publication of their results. There should also be more experimentation with “born-open” data—data archived in an open-access repository at the moment of its creation, and automatically time-stamped.
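For readers unfamiliar with the idea, the local half of a born-open workflow can be as simple as logging every observation with an automatic UTC timestamp as it is recorded; pushing the log to a public repository (e.g., via git) would complete the loop. A minimal sketch, with an arbitrary file name and fields of our own choosing:

```python
# "Born-open" data in miniature: append each observation to a log with
# an automatic UTC timestamp at the moment it is recorded. File name
# and record fields here are arbitrary, for illustration only.
import csv
import datetime
import pathlib

LOG = pathlib.Path("observations.csv")

def record(subject_id, value):
    """Append one observation, time-stamped, creating the log if needed."""
    new_file = not LOG.exists()
    with LOG.open("a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["utc_timestamp", "subject_id", "value"])
        stamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
        writer.writerow([stamp, subject_id, value])

record("S001", 3.7)
record("S002", 4.1)
```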
Given the importance of statistics in modern science, we need better education at all levels to ensure that everyone—future researchers, journalists, legal professionals, policymakers and ordinary citizens—is well-acquainted with the fundamentals of statistical thinking, including the limits to the certainty that statistical methods can provide. Courses in probability and statistics should be part of all secondary school and university curricula, and graduate programs in disciplines that rely heavily on statistics should take care to emphasize the ways in which researchers can misunderstand and misuse statistical concepts and techniques.
Professional incentives have to change too. Universities judging applications for tenure and promotion should look beyond the number of scholars’ publications, giving due weight to the value of replication studies and expecting adherence to strict standards of reproducibility. Journals should make their peer review processes more transparent, and should experiment with guaranteeing publication for research with pre-registered, peer-reviewed hypotheses and procedures. To combat groupthink, scientific disciplines should ask committees of extradisciplinary professionals to evaluate the openness of their fields.
Private philanthropy, government, and scientific industry should encourage all these efforts through appropriate funding and moral support. Governments also need to consider their role as consumers of science. Many government policies are now made on the basis of scientific findings, and the replication crisis means that those findings demand more careful scrutiny. Governments should take steps to ensure that new regulations which require scientific justification rely solely on research that meets strict standards for reproducibility and openness. They should also review existing regulations and policies to determine which ones may be based on spurious findings.
Solving the replication crisis will require a concerted effort from all sectors of society. But this challenge also represents a great opportunity. As we fight to eliminate opportunities and incentives for bad science, we will be rededicating ourselves to good science and cultivating a deeper public awareness of what good science means. Our report is meant as a step in that direction.
David Randall is Director of Research at the National Association of Scholars (NAS). Christopher Welser is an NAS Research Associate.
Freedman, Leonard P., Iain M. Cockburn, and Timothy S. Simcoe (2015), “The Economics of Reproducibility in Preclinical Research.” PLoS Biology, 13(6), e1002165. doi:10.1371/journal.pbio.1002165
Gelman, Andrew and Eric Loken (2014), “The Statistical Crisis in Science.” American Scientist, 102(6), 460–465.
Ioannidis, John P. A. (2005), “Why Most Published Research Findings Are False.” PLoS Medicine, 2(8), e124. doi:10.1371/journal.pmed.0020124
Simmons, Joseph P., Leif D. Nelson, and Uri Simonsohn (2011), “False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant.” Psychological Science, 22(11), 1359–1366.
[From the opinion article, “How Bad is the Government’s Science?” by Peter Wood and David Randall, published at http://www.wsj.com]
“A deeper issue is that the irreproducibility crisis has remained largely invisible to the general public and policy makers. That’s a problem given how often the government relies on supposed scientific findings to inform its decisions. Every year the U.S. adds more laws and regulations that could be based on nothing more than statistical manipulations.”
“All government agencies should review the scientific justifications for their policies and regulations to ensure they meet strict reproducibility standards. The economics research that steers decisions at the Federal Reserve and the Treasury Department needs to be rechecked. The social psychology that informs education policy could be entirely irreproducible. The whole discipline of climate science is a farrago of unreliable statistics, arbitrary research techniques and politicized groupthink.”
“The process of policy-making also needs to be overhauled. Federal agencies that give out research grants should immediately adopt the NIH’s new standards for funding reproducible research. Congress should pass a law—call it the Reproducible Science Reform Act—to ensure that all future regulations are based on similar high standards.”