Lawrence Berkeley National Laboratory
A Measure of Open Data: A Metric and Analysis of Reusable Data Practices in Biomedical Data Resources
Author(s):
- Carbon, Seth
- Champieux, Robin
- McMurry, Julie
- Winfree, Lilly
- Wyatt, Letisha
- Haendel, Melissa
- et al.
Published Web Location: https://www.biorxiv.org/content/early/2018/03/16/282830
ABSTRACT Data is the foundation of science, and there is an increasing focus on how data can be reused and enhanced to drive scientific discoveries. However, most seemingly “open data” resources do not provide legal permissions for reuse and redistribution. Not being able to integrate and redistribute our collective data resources blocks innovation and stymies the creation of life-improving diagnostic and drug selection tools. To help the biomedical research and research support communities (e.g. libraries, funders, and repositories) understand and navigate the data licensing landscape, the (Re)usable Data Project (RDP) (http://reusabledata.org) assesses the licensing characteristics of data resources and how licensing behaviors impact reuse. We have created a ruleset to determine the reusability of data resources and have applied it to 56 scientific data resources (i.e. databases) to date. The results show significant reuse and interoperability barriers. Inspired by game-changing projects like Creative Commons, the Wikimedia Foundation, and the Free Software movement, we hope to engage the scientific community in the discussion regarding the legal use and reuse of scientific data, including the balance of openness and how to create sustainable data resources in an increasingly competitive environment.