Why Data Sharing and Reuse Are Hard To Do
Researchers are producing an unprecedented amount of data by using new methods and instrumentation. By accessing and reusing these data, scientists can address complex research problems that need systemic approaches to knowledge discovery. However, research data are often not readily available, and even when data are shared, they cannot be reused outside their original context of production. Based on our studies of data practices in science, we compare data sharing and reuse challenges faced by researchers in life sciences, oceanography, astronomy, molecular biology, and genomics. Data sharing difficulties include determining what to release, when, in what format, and by what means. Data reuse challenges include determining what data could be reused, by whom (expertise required), with whom (collaborative environments), under what conditions (issues of data quality and curation), why (needs for data integration, control, and comparison), and to what effects (types of analysis).