A Random Sampling of the State of Transparency and Reproducibility in Social Science Journals
[From the preprint “An empirical assessment of transparency and reproducibility-related research practices in the social sciences (2014-2017)” by Tom Hardwicke, Joshua Wallach, Mallory Kidwell, & John Ioannidis posted at MetaArXiv Preprints]
“In this study, we evaluated a broad range of indicators related to transparency and reproducibility in a random sample of 198 articles published in the social sciences between 2014 and 2017.”
“…sample characteristics are displayed in Table 2.”
“Among the 198 eligible articles, 95…had a publicly available version (Fig. 1A) whereas 84…were only accessible to us through a paywall, …19…additional articles were not available to our team, highlighting that even researchers with broad academic access privileges cannot reach a portion of the scientific literature.”
“Raw data are the core evidence that underlies scientific claims. However, the vast majority of 103 articles we examined did not contain a data availability statement (Fig. 1D). Eight articles stated that they had used an external data source but it was unclear if the data were available. Among a further 8…datasets that were reportedly available, 2 were reportedly ‘available upon request’ from the authors and 2 had broken links to supplementary materials and a personal/institutional webpage. Of the 4 accessible datasets…2 were both incomplete and not clearly documented.”
“Analysis scripts provide detailed step-by-step descriptions of performed analyses, often in the form of computer code (e.g., R, Python, or Matlab) or syntax (SPSS, Stata, SAS). Although 3 of 103 … articles reported that analysis scripts were available (Fig. 1E), 1 of these involved a broken link to journal-hosted supplementary information and 1 was only “available on request”, which we did not attempt to confirm. The 1 available analysis script was provided in journal-hosted supplementary information.”
“Replication studies repeat the methods of previous studies in order to systematically gather evidence on a given research question. Evidence gathered across multiple relevant studies can be formally collated and synthesized through systematic reviews and meta-analyses. Only 2 of the 103 … articles we examined explicitly self-identified as a replication study.”
“Pre-registration refers to the archiving of a read-only, time-stamped study protocol in a public repository (such as the Open Science Framework) prior to study commencement….None of the articles specified that the study was pre-registered…”
“Our empirical assessment of a random sample of articles published between 2014 and 2017 suggests a serious neglect of transparency and reproducibility in the social sciences. Most research resources, such as materials, protocols, raw data, and analysis scripts, were not explicitly available, no studies were pre-registered, disclosure of funding sources and conflict of interests was modest, and replication or evidence synthesis via meta-analysis or systematic review was rare.”
The sample size is incredibly small: fewer than 200 studies drawn from the more than 6000 journals categorized as “social science” by Scopus over 4 years (if each journal publishes four issues a year with five articles each, that is already nearly half a million published articles). Worse, if you look at the actual data https://osf.io/2pqhw/ you see the sample includes studies from journals like Plastics Engineering, CHEMICAL ENGINEERING TRANSACTIONS, British Medical Journal Open, Journal of Engineering for Gas Turbines and Power, Frontiers of Information Technology and Electronic Engineering, European Journal of Paediatric Dentistry, Carpathian Journal of Earth and Environmental Sciences, Wounds UK, Canadian Family Physician, Journal of Clinical Urology, European Journal of Philosophy, Archives of Physical Medicine and Rehabilitation, Journal of Nutrition and Health, Zhongguo Jixie Gongcheng/China Mechanical Engineering, Computers in Industry, Journal of Musicology, Pacific Historical Review, twice The Chaucer Review (A Journal of Medieval Studies and Literary Criticism), International Journal of Developmental Neuroscience, twice Revista Facultad de Ingenieria, Biogeosciences, Diabetes Primary Care, and SMT Surface Mount Technology Magazine, as well as a number of journals that focus on statistics, mathematics and methodology.
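The back-of-envelope arithmetic above is easy to check. Note that the per-journal output figures (four issues a year, five articles per issue) are the post's own rough assumptions, not data from the preprint:

```python
# Rough estimate of the population the 198-article sample is drawn from,
# under the assumptions stated above (4 issues/year, 5 articles/issue).
journals = 6000         # journals categorized as "social science" by Scopus
issues_per_year = 4     # assumed
articles_per_issue = 5  # assumed
years = 4               # 2014-2017

population = journals * issues_per_year * articles_per_issue * years
sample = 198

print(population)  # 480000 -- "nearly half a million"
print(round(sample / population * 100, 3))  # sampling fraction: 0.041 percent
```

Even on these conservative assumptions, the 198 articles represent roughly four hundredths of one percent of the literature in question.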
Either this is a weird test of whether the scientific community actually looks at the underlying data of a study, or this is the worst research on “social sciences” that I have ever seen.