Skip to main content
eScholarship
Open Access Publications from the University of California

UCSF

UC San Francisco Previously Published Works bannerUCSF

Harmonization-Information Trade-Offs for Sharing Individual Participant Data in Biomedicine

Abstract

Biomedical practice is evidence-based. Peer-reviewed papers are the primary medium to present evidence and data-supported results to drive clinical practice. However, it could be argued that scientific literature does not contain data, but rather narratives about and summaries of data. Meta-analyses of published literature may produce biased conclusions due to the lack of transparency in data collection, publication bias, and inaccessibility to the data underlying a publication ('dark data'). Co-analysis of pooled data at the level of individual research participants can offer higher levels of evidence, but this requires that researchers share raw individual participant data (IPD). FAIR (findable, accessible, interoperable, and reusable) data governance principles aim to guide data lifecycle management by providing a framework for actionable data sharing. Here we discuss the implications of FAIR for data harmonization, an essential step for pooling data for IPD analysis. We describe the harmonization-information trade-off, which states that the level of granularity in harmonizing data determines the amount of information lost. Finally, we discuss a framework for managing the trade-off and the levels of harmonization. In the coming era of funder mandates for data sharing, research communities that effectively manage data harmonization will be empowered to harness big data and advanced analytics such as machine learning and artificial intelligence tools, leading to stunning new discoveries that augment our understanding of diseases and their treatments. By elevating scientific data to the status of a first-class citizen of the scientific enterprise, there is strong potential for biomedicine to transition from a narrative publication product orientation to a modern data-driven enterprise where data itself is viewed as a primary work product of biomedical research.

Many UC-authored scholarly publications are freely available on this site because of the UC's open access policies. Let us know how this access is important for you.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View