eScholarship
Open Access Publications from the University of California

UC San Francisco Previously Published Works

Precision annotation of digital samples in NCBI's gene expression omnibus.

Author(s):

  • Hadley, Dexter
  • Pan, James
  • El-Sayed, Osama
  • Aljabban, Jihad
  • Aljabban, Imad
  • Azad, Tej D
  • Hadied, Mohamad O
  • Raza, Shuaib
  • Rayikanti, Benjamin Abhishek
  • Chen, Bin
  • Paik, Hyojung
  • Aran, Dvir
  • Spatz, Jordan
  • Himmelstein, Daniel
  • Panahiazar, Maryam
  • Bhattacharya, Sanchita
  • Sirota, Marina
  • Musen, Mark A
  • Butte, Atul J
  • et al.

Abstract

The Gene Expression Omnibus (GEO) contains more than two million digital samples from functional genomics experiments amassed over almost two decades. However, individual sample metadata remains poorly described by unstructured free-text attributes, which prevents its large-scale reanalysis. We introduce the Search Tag Analyze Resource for GEO as a web application (http://STARGEO.org) to curate better annotations of sample phenotypes uniformly across different studies, and to use these sample annotations to define robust genomic signatures of disease pathology by meta-analysis. In this paper, we target a small group of biomedical graduate students to show rapid crowd-curation of precise sample annotations across all phenotypes, and we demonstrate the biological validity of these crowd-curated annotations for breast cancer. STARGEO.org makes GEO data findable, accessible, interoperable, and reusable (i.e., FAIR) to ultimately facilitate knowledge discovery. Our work demonstrates the utility of crowd-curation and interpretation of open 'big data' under FAIR principles as a first step towards realizing an ideal paradigm of precision medicine.
