NOTE: This entry is based on "Replication in Labor Economics: Evidence from Data, and What It Suggests," American Economic Review 107 (May 2017).
In Hamermesh (2007) I bemoaned the paucity of “hard-science” style replication in applied economics. I shouldn’t have, as my examination of the citation histories of 10 leading articles in empirical labor economics published between 1990 and 1996 shows. Each selected article had to have been published in a so-called “Top 5” journal and to have accumulated at least 1000 Google Scholar (GS) citations. I examined every publication that the Web of Science had recorded as having cited the work, reading first the abstract and then, if necessary, skimming through the citing paper itself. I classified each citing article by whether it was: 1) Related to; 2) Inspired by; 3) Very similar to but using different data; or 4) A direct replication at least partly using the same data.
The distribution of the over 3,000 citing papers across the four categories was: related, 92.9 percent; inspired, 5.0 percent; similar, 1.5 percent; replicated, 0.6 percent. Replication, even loosely defined, is thus fairly rare, even for these most visible studies. However, 7 of the 10 articles were replicated (coded 3 or 4) at least 5 times, and the remaining 3 were replicated 1, 2 and 4 times. Published replications of these most heavily cited papers do get performed, so that one might view the replication glass as 100 percent full.
Replications of most studies, even those appearing in Top 5 journals, are not published, nor should they be: The majority of articles in those journals are essentially ignored (Hamermesh, 2017), so that the failure to replicate them is unimportant. But the most important studies (judged by market responses) are replicated as they should be: by taking the motivating economic idea and examining its implications using a set of data describing a different time and/or economy. The empirical validity of these ideas, after their relevance is first demonstrated for a particular time and place, can only be usefully probed at other times and places: If they are general descriptions of behavior, they should hold up beyond their original testing ground. Simple laboratory-style replication is important for catching errors in influential work; but the more important kind of replication goes beyond this and, as I have shown, is being done.
The evidence suggests that the system is not broken and does not need fixing. But what if one believes that more replication, using mostly the same data as in the original study, is necessary? A bit of history: During the 1960s the American Economic Review was replete with replication-like papers in the form of Comments (often replications using the same or other data), Replies, and even Rejoinders. For example, in the four regular issues of the 1966 volume, 16 percent of the space went to such contributions. In the first four regular issues of the 2013 volume only 4 percent did, reflecting a change that began by the 1980s. Editors shifted away from cluttering the Review's pages with Comments and the like, perhaps reflecting their desire to maximize its impact on the profession in light of their realization that pages devoted to this type of exercise generate little attention from other authors (Whaples, 2006).
We have had replications or approximations thereof in the past, but the market for scholarship—as indicated by their impact—has exhibited little interest in them. And we still publish replications, but, as I have shown, in the more appropriate and worthwhile form of examinations of data from other times and places. Overall the evidence suggests that the system is not broken and does not need fixing; and that the most obvious way of fixing this unbroken system has already been rejected by the market.
Daniel Hamermesh is Professor of Economics at Royal Holloway, University of London, Research Associate at the National Bureau of Economic Research, and Research Associate at the Institute for the Future of Labor (IZA).
Hamermesh, Daniel S. 2007. "Viewpoint: Replication in Economics." Canadian Journal of Economics 40 (3): 715-33.
Hamermesh, Daniel S. 2017. "Citations in Economics: Measurement, Impacts and Uses." Journal of Economic Literature 55, forthcoming.
Whaples, Robert M. 2006. "The Costs of Critical Commentary in Economics Journals." Econ Journal Watch 3 (2): 275-82.
NOTE: To be classified as "inspired," the citing paper had to refer repeatedly to the original paper and/or make clear that it was inspired by the methodology of the original work. To be classified as "similar," the citing paper had to use the same methodology but on a different data set, while a study classified as a "replication" went further, using at least some of the data in the original study. Thus even a "replication" in many cases involved more than simply re-estimating the models in the original article on the same data.