BOB REED: The Problem With Open Data: Would Requiring Co-Authorship Help?

There has been a huge amount of attention focused on “open data.” A casual reading of the blogosphere suggests a simple consensus: Open Data is good, Secret Data is bad.
Remarkably, there has been very little discussion of the property-rights issues associated with open data. The Open Data Movement wants to turn a private good (datasets) into a public good. Economists know something about public goods: they tend to be under-produced. This introduces a trade-off between the propagation of data for use by multiple researchers, a social good (though see here for a discussion where this is not necessarily so), and the disincentive this creates for producing data, a social bad. How best to make this trade-off is unclear.
In a recent blog post entitled “Open data, authorship, and the early career scientist”, MARGARET KOSMALA, a postdoctoral fellow at Harvard University, argues that making one’s data available to others hurts the data-producing scholar, particularly younger scholars. The argument is not so much that the data-producing scholar will be scooped by other scientists on the associated research. Rather, it is that subsequent research projects that could have resulted in publications for the data-producing scholar will end up being undertaken by other scientists. And while Kosmala does not make this point explicitly, this serves as a disincentive for scientists to produce data, if only because younger scholars may not be able to produce enough publications to get the funding and tenure they need to continue their careers.
What is really interesting about this post is that it led to a discussion between a reader and the author about the ethics of “requiring co-authorship” when authors use data produced by another scientist. Missing from the discussion was the recognition that “requiring co-authorship” provides a potential solution to the problem of open data. It is a way for the data-producing scientist to reap the rewards of data production, while still allowing other researchers to use the data.
Of course, there are issues associated with implementing a policy like this. Once data are released, how will the data-producing author be able to ensure that others who use the data will extend co-authorship to them? And suppose the data-producing author does not wish to have their data used in a certain way. Should they have the right to restrict its use? While the answers are debatable, the questions are illuminating, because they make us realise that the debate over open data is just another application of the larger subject of intellectual property rights.


