[This blog is a repost from the article “Publishers cannot afford to be coy about ethical breaches” published April 19th, 2018 in the Times Higher Education by Adam Cox, Russell Craig, and Dennis Tourish.]
There are rising concerns about the reliability of academic research, yet even when papers are retracted, the reasons are often left unexplained.
We recently studied 734 peer-reviewed journals in economics and identified 55 papers retracted for reasons other than “accidental duplication” or “administrative error”. Of those, 28 gave no clear indication of whether any questionable research practice was involved. It appears likely that it was: the reasons given for retraction in the other 27 papers include fake peer review, plagiarism, flawed reasoning, and multiple submission.
For 23 of the 28 “no reason” retractions, it is not even clear who instigated them: the editor alone, the author alone, or both in concert.
This reticence means that other papers by the same authors may not be investigated – as they should be – and are left in circulation. The feelings of authors may be spared, but the disincentives for them and others to engage in malpractice are reduced.
Many publishers refer approvingly to the guidelines of the Committee on Publication Ethics and the International Committee of Medical Journal Editors, which require the disclosure of a clear reason for retraction. However, we found that publishers’ policy statements on retraction are often ambiguous and unclear about what action they will take in response to serious research-related offences.
Perhaps the publishers are reluctant to embarrass themselves. Or perhaps they are intimidated by the possibility of legal action. But apart from their ethical obligations, they should recognise that the growing awareness of malpractice is diminishing public confidence in research integrity.
Publishers will claim that they safeguard research quality by providing a level of editorial scrutiny that keeps poor scholarship out of journals. If that claim is diluted, so is much of their unique selling point. However, if publishers take robust action against malpractice, they will have a stronger claim that they add value to the publishing process when it comes to safeguarding standards.
In our view, the publisher of a journal retracting a paper for research malpractice should be obliged to alert other journals that have published papers by the same authors. In egregious cases, such as those involving data fabrication, those journals’ editors should be required to audit the papers.
Relatedly, publishers should require submitting authors to make their data available in a way that facilitates inspection, re-analysis and replication. This would act as a bulwark against data fraud and poor statistical analysis. Such a requirement is reasonably widespread in the physical and life sciences, but it still tends to be confined to the top echelon of journals in economics. This may help explain why we found no articles retracted because of data fabrication.
Greater diligence is warranted. The Research Papers in Economics Plagiarism Committee is an international group of academic volunteers, mostly economists, who look into possible cases of plagiarism. They are well known in the economics community and have, to date, identified seven papers as involving malpractice. As we write, none of these have been retracted or corrected.
Nor is the social science community particularly diligent at watermarking those papers that are retracted. An article retracted by the American Economic Review in 2007, for instance, is still not identified as retracted anywhere in the document. Failure to mark flawed papers runs the risk that defective work might continue to be cited and influence scholarly thinking.
Journals must be more proactive. Failure to take serious actions against malpractice in scholarly publications is harming the integrity of research. Publishers and editors are critical gatekeepers. They cannot go on demanding full transparency from authors while being so non-transparent themselves.
Adam Cox is a senior lecturer in economics and finance and Russell Craig is professor of accounting and financial management, both at the University of Portsmouth. Dennis Tourish is professor of leadership and organisation studies at the University of Sussex. This is an abridged version of their paper, “Retraction statements and research malpractice in economics”, published in Research Policy.
[From the article, “Why all randomised controlled trials produce biased results”, by Alexander Krauss, recently published in Annals of Medicine]
“Randomised controlled trials (RCTs) are commonly viewed as the best research method to inform public health and social policy. Usually they are thought of as providing the most rigorous evidence of a treatment’s effectiveness without strong assumptions, biases and limitations.”
“This is the first study to examine that hypothesis by assessing the 10 most cited RCT studies worldwide.”
“…This study shows that these world-leading RCTs that have influenced policy produce biased results by illustrating that participants’ background traits that affect outcomes are often poorly distributed between trial groups, that the trials often neglect alternative factors contributing to their main reported outcome and, among many other issues, that the trials are often only partially blinded or unblinded. The study here also identifies a number of novel and important assumptions, biases and limitations not yet thoroughly discussed in existing studies that arise when designing, implementing and analysing trials.”
[From a guest blog entitled “What To Do About The Reproducibility Crisis” by Letisha Wyatt, posted at jove.com]
“Late in my biomedical science graduate training, I learned that researchers often overlook the fundamental processes of adequately preparing data and all associated components (e.g., methods, analysis procedures) for access and preservation. This applies both inside and outside their workgroups. In my opinion, this fundamental step of preservation has a big role in scientific reproducibility.”
“Ideally, data is FAIR-TLC: findable, accessible, interoperable, reusable, traceable, licensed, and connected. Scientists can learn much (as well as receive support) from librarians on how to best plan and execute FAIR-TLC principles, to better serve reproducible research.”
“Additionally, librarians can also be champions of change through less direct avenues, such as:”
– “Building awareness about the reproducibility crisis”
– “Finding and highlighting useful resources on reproducibility”
– “Defining policies about institutional data management”
– “Advocating for trainee information and data literacy needs”
– “Fostering programming and similar communities”
[From a Twitter post by Gilad Feldman]
“What is a successful replication? My students and I wanted to have a clear guide with examples, but couldn’t find a clear, straightforward article. Here’s a suggested table. Please help: what’s wrong/inaccurate or missing? Any other simple criteria to add?”

Any thoughts? Respond to Gilad at
[This post is based on the report, “The Irreproducibility Crisis of Modern Science: Causes, Consequences and the Road to Reform”, recently published by the National Association of Scholars]
For more than a decade, and especially since the publication of a famous 2005 article by John Ioannidis, scientists in various fields have been concerned with the problems posed by the replication crisis. The importance of the crisis demands that it be understood by a larger audience of educators, policymakers, and ordinary citizens. To this end, our new report, The Irreproducibility Crisis of Modern Science, outlines the nature, causes, and significance of the crisis, and offers a series of proposals for confronting it.
At its most basic level, the crisis arises from the widespread use of statistical methods that inevitably produce some false positives. Misuse of these methods easily increases the number of false positives, leading to the publication of many spurious findings of statistical significance. “P-hacking” (running repeated statistical tests until a finding of significance emerges) is probably the most common abuse of statistical methods, but inadequate specification of hypotheses and the tendentious construction of datasets are also serious problems. (Gelman and Loken 2014 provide several good examples of how easily these latter faults can vitiate research findings.)
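The inflation described above is easy to demonstrate. The following is an illustrative sketch, not taken from the report: it simulates “optional stopping,” one common form of p-hacking, in which a researcher tests a fair coin for bias, peeks at the p-value after every batch of flips, and stops as soon as significance appears. All parameter choices (batch size, number of looks) are hypothetical.

```python
# Sketch (hypothetical parameters): optional stopping inflates the
# false-positive rate well above the nominal 5%, even when the null
# hypothesis is true by construction (the coin really is fair).
import math
import random

def z_pvalue(heads: int, n: int) -> float:
    """Two-sided p-value for H0: fair coin, via the normal approximation."""
    z = (heads - n / 2) / math.sqrt(n / 4)
    # doubled upper tail of the standard normal, computed with math.erf
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def one_trial(rng: random.Random, peek: bool,
              batch: int = 20, max_n: int = 100) -> bool:
    """Flip a FAIR coin; return True if 'significance' (p < 0.05) is declared.

    peek=False: test once, after all max_n flips (the honest analysis).
    peek=True:  test after every batch and stop at the first p < 0.05.
    """
    heads = 0
    for n in range(batch, max_n + 1, batch):
        heads += sum(rng.random() < 0.5 for _ in range(batch))
        if peek and z_pvalue(heads, n) < 0.05:
            return True
    return z_pvalue(heads, max_n) < 0.05

rng = random.Random(0)
sims = 2000
honest = sum(one_trial(rng, peek=False) for _ in range(sims)) / sims
hacked = sum(one_trial(rng, peek=True) for _ in range(sims)) / sims
print(f"false-positive rate, single test: {honest:.3f}")
print(f"false-positive rate, with peeking: {hacked:.3f}")
```

With five looks at the data instead of one, the share of “significant” results on a fair coin roughly doubles, which is exactly the mechanism by which repeated testing manufactures spurious findings.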
Methodological errors and abuses are enabled by too much researcher freedom and too little openness about data and procedures. Researchers’ unlimited freedom in specifying their research designs—and especially their freedom to change their research plans in mid-course—makes it possible to conjure statistical significance even for obviously nonsensical hypotheses (Simmons, Nelson, and Simonsohn 2011 provide a classic demonstration of this). At the same time, lack of outside access to researchers’ data and procedures prevents other experts from identifying problems in experimental design.
Other factors in the irreproducibility crisis exist at the institutional level. Academia and the media create powerful incentives for researchers to advance their careers by publishing new and exciting positive results, while inevitable professional and political tendencies toward groupthink prevent challenges to an existing consensus.
The consequences of all these problems are serious. Not only is a lot of money being wasted—in the United States, up to $28 billion annually on irreproducible preclinical research alone (Freedman et al. 2015)—but individuals and policymakers end up making bad decisions on the basis of faulty science. Perhaps the worst casualty is public confidence in science, as people awaken to how many of the findings they hear about in the news can’t actually be trusted.
Fixing the replication crisis will require energetic efforts to address its causes at every level. Many scientists have already taken up the challenge, and institutions like the Center for Open Science and the Meta-Research Innovation Center at Stanford (METRICS), both in the U.S., have been established to improve the reproducibility of research. Some academic journals have changed the ways in which they ask researchers to present their results, and other journals, such as the International Journal for Re-Views in Empirical Economics, have been created specifically to push back against publication bias by publishing negative results and replication studies. National and international organizations, including the World Health Organization, have begun delineating more stringent research standards.
But much more remains to be done. In an effort to spark an urgently needed public conversation on how to solve the reproducibility crisis, our report offers a series of forty recommendations. At the level of statistics, researchers should cease to regard p-values as dispositive measures of evidence for or against a particular hypothesis, and should try to present their data in ways that avoid a simple either/or determination of statistical significance. Researchers should also pre-register their research procedures and make their methods and data publicly available upon publication of their results. There should also be more experimentation with “born-open” data—data archived in an open-access repository at the moment of its creation, and automatically time-stamped.
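A minimal sketch of the “born-open” idea mentioned above: data are written to an open archive at the moment of creation, with a timestamp and a content hash recorded in a manifest so that later silent revisions become detectable. The directory layout, file names, and field names here are invented for illustration; a real deployment would push to a public repository rather than a local folder.

```python
# Hypothetical sketch of "born-open" data deposit: each dataset is
# archived as soon as it is created, and a manifest records its
# SHA-256 hash and creation time. All paths/names are invented.
import csv
import hashlib
import json
import time
from pathlib import Path

ARCHIVE = Path("archive")  # stand-in for an open-access repository
ARCHIVE.mkdir(exist_ok=True)

def deposit(rows: list, name: str) -> Path:
    """Write rows (list of dicts) as CSV and append a manifest entry."""
    path = ARCHIVE / f"{name}.csv"
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=sorted(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    with (ARCHIVE / "manifest.jsonl").open("a") as f:
        f.write(json.dumps({"file": path.name,
                            "sha256": digest,
                            "deposited_at": time.time()}) + "\n")
    return path

deposit([{"subject": 1, "score": 0.7}, {"subject": 2, "score": 0.4}],
        "session_001")
```

The hash in the manifest is the automatic time-stamping the report calls for: anyone can recompute it from the archived file and confirm the data have not been altered since deposit.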
Given the importance of statistics in modern science, we need better education at all levels to ensure that everyone—future researchers, journalists, legal professionals, policymakers and ordinary citizens—is well-acquainted with the fundamentals of statistical thinking, including the limits to the certainty that statistical methods can provide. Courses in probability and statistics should be part of all secondary school and university curricula, and graduate programs in disciplines that rely heavily on statistics should take care to emphasize the ways in which researchers can misunderstand and misuse statistical concepts and techniques.
Professional incentives have to change too. Universities judging applications for tenure and promotion should look beyond the number of scholars’ publications, giving due weight to the value of replication studies and expecting adherence to strict standards of reproducibility. Journals should make their peer review processes more transparent, and should experiment with guaranteeing publication for research with pre-registered, peer-reviewed hypotheses and procedures. To combat groupthink, scientific disciplines should ask committees of extradisciplinary professionals to evaluate the openness of their fields.
Private philanthropy, government, and scientific industry should encourage all these efforts through appropriate funding and moral support. Governments also need to consider their role as consumers of science. Many government policies are now made on the basis of scientific findings, and the replication crisis means that those findings demand more careful scrutiny. Governments should take steps to ensure that new regulations which require scientific justification rely solely on research that meets strict standards for reproducibility and openness. They should also review existing regulations and policies to determine which ones may be based on spurious findings.
Solving the replication crisis will require a concerted effort from all sectors of society. But this challenge also represents a great opportunity. As we fight to eliminate opportunities and incentives for bad science, we will be rededicating ourselves to good science and cultivating a deeper public awareness of what good science means. Our report is meant as a step in that direction.
David Randall is Director of Research at the National Association of Scholars (NAS). Christopher Welser is an NAS Research Associate.
References
Freedman, Leonard P., Iain M. Cockburn, and Timothy S. Simcoe (2015), “The Economics of Reproducibility in Preclinical Research.” PLoS Biology, 13(6), e1002165. doi:10.1371/journal.pbio.1002165
Gelman, Andrew and Eric Loken (2014), “The Statistical Crisis in Science.” American Scientist, 102(6), 460–465.
Ioannidis, John P. A. (2005), “Why Most Published Research Findings Are False.” PLoS Medicine, 2(8), e124. doi:10.1371/journal.pmed.0020124
Simmons, Joseph P., Leif D. Nelson, and Uri Simonsohn (2011), “False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant.” Psychological Science, 22(11), 1359–1366.
[From the opinion article, “How Bad is the Government’s Science?” by Peter Wood and David Randall, published at http://www.wsj.com]
“A deeper issue is that the irreproducibility crisis has remained largely invisible to the general public and policy makers. That’s a problem given how often the government relies on supposed scientific findings to inform its decisions. Every year the U.S. adds more laws and regulations that could be based on nothing more than statistical manipulations.”
“All government agencies should review the scientific justifications for their policies and regulations to ensure they meet strict reproducibility standards. The economics research that steers decisions at the Federal Reserve and the Treasury Department needs to be rechecked. The social psychology that informs education policy could be entirely irreproducible. The whole discipline of climate science is a farrago of unreliable statistics, arbitrary research techniques and politicized groupthink.”
“The process of policy-making also needs to be overhauled. Federal agencies that give out research grants should immediately adopt the NIH’s new standards for funding reproducible research. Congress should pass a law—call it the Reproducible Science Reform Act—to ensure that all future regulations are based on similar high standards.”
[From the White paper, “Practical Challenges for Researchers in Data Sharing”, posted at springernature.com]
“In one of the largest surveys of researchers about research data (with over 7,700 respondents), Springer Nature finds widespread data sharing associated with published works and a desire from researchers that their data are discoverable… 63% of respondents stated that they generally submit data files as supplementary information, deposit the files in a repository, or both. 76% of researchers rated the importance of making their data discoverable highly – with an average rating of 7.3 out of 10, and the most popular rating being 10 out of 10 (25%).”
“The results suggest two areas of focus that could increase the sharing of data amongst researchers, regardless of subject specialism or location:”
– “Increased education and support on good data management for all researchers, but particularly at early stages of researchers’ careers.”
– “Faster, easier routes to optimal ways of sharing data.”
NOTE FROM TRN: Anybody else surprised by how keen researchers say they are to share their data? Doesn’t seem to be true in my disciplinary neighborhood.
[From the article “A survey on data reproducibility and the effect of publication process on the ethical reporting of laboratory research,” forthcoming in the journal Clinical Cancer Research]
“We developed an anonymous online survey intended for trainees involved in bench research. The survey included questions related to mentoring/career development, research practice, integrity and transparency, and how the pressure to publish, and the publication process itself influence their reporting practices.”
“…39.2% revealed having been pressured by a principal investigator or collaborator to produce “positive” data. 62.8% admitted that the pressure to publish influences the way they report data.”
“… This survey indicates that trainees believe that the pressure to publish impacts honest reporting, mostly emanating from our system of rewards and advancement. The publication process itself impacts faculty and trainees and appears to influence a shift in their ethics from honest reporting (“negative data”) to selective reporting, data falsification, or even fabrication.”
[From the preprint article “Researcher conduct determines data reliability” by Mark Wass, Larry Ray, and Martin Michaelis]
“Our findings demonstrate the need for systematic meta-research on the issue of data reproducibility. A reproducibility crisis is widely recognised among researchers from many different fields. There is no shortage of suggestions on how data reproducibility could be improved, but quantitative data on the subject (including the scale of the problem) are largely missing.”
Andrew Gelman had a great post yesterday that highlights a major issue — a really major issue — with replication. The problem is, there is no commonly accepted definition of what a “replication” is. Even when a definition is provided, there is no commonly accepted standard for how to interpret the results of a replication.
The post consists of a series of email excerpts between the author of an original study (Dan Kahan) and the co-authors of a study that claimed “failure to replicate” his study (Christina Ballarini and Steve Sloman), with occasional commentary from Gelman.
The post goes like this:
— Kahan emails Ballarini and Sloman to dispute their claim that they “failed to replicate” his study.
— Ballarini and Sloman both agree that they should not have said their study “failed to replicate” Kahan’s.
— Kahan asks that they make an effort to publicly correct the record.
— Sloman responds by saying that he didn’t really mean that their study didn’t “fail to replicate.” He says “I stand by our report even if you didn’t like one of our verbs [replicate].”
— Kahan then writes a paper refuting the claim that Ballarini and Sloman “failed to replicate” his research (Title of paper = “Rumors of the ‘Nonreplication’ of the ‘Motivated Numeracy Effect’ are Greatly Exaggerated”)
Kahan’s conclusion: “This is a case study in how replication can easily go off the rails. The same types of errors people make in non-replicated papers will now be used in replications.”
Alternatively, one could argue this is NOT a case study in how replication can easily go off the rails. Rather, it illustrates that there are no rails.
To read Gelman’s post in its entirety, click here.