
Thesis
Peer Reviewed

Analysis of genomic variants via gene networks

Hofree, Matan
Advisor(s): Ideker, Trey

UC San Diego Electronic Theses and Dissertations (2014)

Genome-wide measurements of genomic state offer unprecedented opportunities for biological discovery, with the potential to make a dramatic impact on medicine and life. One fundamental challenge is associating complex phenotypes with genetic cause. Here, I will describe efforts to advance solutions to this challenge via analysis of gene networks.

Genome-wide association studies are designed to link a phenotype to genomic loci anywhere in the genome; however, applying standard statistics to such data has fallen far short of building accurate predictive models for disease. We use AdaBoost, a large-margin classification algorithm, to predict disease status in two diabetes cohorts and suggest a method for overcoming limitations arising from correlation between genetic variants. We uncover a novel set of 163 disease associations missed by "classic" statistics.

Classification of cancer remains predominantly organ based and fails to account for considerable heterogeneity of outcomes. Tumor genomes provide a new source of data for uncovering subtypes, but are difficult to compare, as tumors share few mutations in common. We introduce network-based stratification (NBS), a method for integrating somatic genomes with networks encoding biological knowledge. This allows for identification of cancer subtypes by clustering tumors with mutations in similar network regions. We demonstrate NBS in multiple cancer cohorts, identifying subtypes predictive of clinical features and outcomes, and highlighting sub-networks characteristic of each.

Current approaches for identifying cancer genes rely on the idea that particular perturbations, occurring in a subset of genes unique to each cancer type, are selected for by conferring a survival advantage to tumor cells. Such genes are expected to be enriched for mutations when examined across a population. Here we show that 30-50% of well-known cancer genes are not significantly elevated in mutation frequency. Despite this lack of enrichment, known cancer genes are enriched for mutations causing changes in amino-acid composition, protein structure properties and conservation. Furthermore, we observe 15-30% of cancer genes have altered mutation rates conditioned on other genes, each individually spanning the range of single-gene mutation frequencies, implicating a large genetic interaction network underlying human cancer. This suggests a substantial number of cancer genes will never be identified by frequency alone.

1 supplemental file


Article
Peer Reviewed

Challenges in identifying cancer genes by analysis of exome sequencing data

UC San Diego Previously Published Works (2016)

Massively parallel sequencing has permitted an unprecedented examination of the cancer exome, leading to predictions that all genes important to cancer will soon be identified by genetic analysis of tumours. To examine this potential, here we evaluate the ability of state-of-the-art sequence analysis methods to specifically recover known cancer genes. While some cancer genes are identified by analysis of recurrence, spatial clustering or predicted impact of somatic mutations, many remain undetected due to lack of power to discriminate driver mutations from the background mutational load (13-60% recall of cancer genes impacted by somatic single-nucleotide variants, depending on the method). Cancer genes not detected by mutation recurrence also tend to be missed by all types of exome analysis. Nonetheless, these genes are implicated by other experiments such as functional genetic screens and expression profiling. These challenges are only partially addressed by increasing sample size and will likely hold even as greater numbers of tumours are analysed.


Article
Peer Reviewed

Network-based stratification of tumor mutations

UC San Diego Previously Published Works (2013)

Many forms of cancer have multiple subtypes with different causes and clinical outcomes. Somatic tumor genome sequences provide a rich new source of data for uncovering these subtypes but have proven difficult to compare, as two tumors rarely share the same mutations. Here we introduce network-based stratification (NBS), a method to integrate somatic tumor genomes with gene networks. This approach allows for stratification of cancer into informative subtypes by clustering together patients with mutations in similar network regions. We demonstrate NBS in ovarian, uterine and lung cancer cohorts from The Cancer Genome Atlas. For each tissue, NBS identifies subtypes that are predictive of clinical outcomes such as patient survival, response to therapy or tumor histology. We identify network regions characteristic of each subtype and show how mutation-derived subtypes can be used to train an mRNA expression signature, which provides similar information in the absence of DNA sequence.
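
The idea of comparing tumors by "similar network regions" rather than by identical mutations can be illustrated with a small sketch. The abstract does not specify the algorithm, so everything below is an assumption made for illustration: a random-walk-with-restart smoothing step, a six-gene toy network, three hypothetical tumors, and the function names `normalize_adjacency`, `propagate`, and `cosine`. It is not the published NBS implementation, and it stops before any clustering step.

```python
import math

def normalize_adjacency(adj):
    """Divide each row by its degree so that every non-empty row sums to 1."""
    normalized = []
    for row in adj:
        degree = sum(row)
        normalized.append([v / degree if degree else 0.0 for v in row])
    return normalized

def propagate(profile, adj_norm, alpha=0.5, n_iter=50):
    """Spread a mutation profile over the network (assumed smoothing rule).

    Iterates F_{t+1} = alpha * F_t * A + (1 - alpha) * F_0, where A is the
    row-normalized adjacency matrix and F_0 is the original profile.
    """
    n = len(profile)
    current = list(profile)
    for _ in range(n_iter):
        current = [
            alpha * sum(current[i] * adj_norm[i][j] for i in range(n))
            + (1 - alpha) * profile[j]
            for j in range(n)
        ]
    return current

def cosine(u, v):
    """Cosine similarity between two profiles (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical toy network: genes 0-2 form one module, genes 3-5 another,
# and a single edge (2-3) bridges the two modules.
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
adj_norm = normalize_adjacency(adjacency)

# Hypothetical tumors, each with one mutated gene; no two share a mutation.
tumor_a = [1, 0, 0, 0, 0, 0]  # mutation in the first module
tumor_b = [0, 1, 0, 0, 0, 0]  # different gene, same module
tumor_c = [0, 0, 0, 0, 1, 0]  # mutation in the second module

smooth_a = propagate(tumor_a, adj_norm)
smooth_b = propagate(tumor_b, adj_norm)
smooth_c = propagate(tumor_c, adj_norm)

print("raw similarity A-B:", cosine(tumor_a, tumor_b))
print("smoothed similarity A-B:", cosine(smooth_a, smooth_b))
print("smoothed similarity A-C:", cosine(smooth_a, smooth_c))
```

In this toy setting the raw profiles of tumors A and B have no overlap, yet after smoothing they are more similar to each other than either is to tumor C, because their mutations fall in the same module. Any standard clustering method could then be applied to the smoothed profiles.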


Article
Peer Reviewed

Massively parallel single-nucleus RNA-seq with DroNc-seq

UC Berkeley Previously Published Works (2017)

Single-nucleus RNA sequencing (sNuc-seq) profiles RNA from tissues that are preserved or cannot be dissociated, but it does not provide high throughput. Here, we develop DroNc-seq: massively parallel sNuc-seq with droplet technology. We profile 39,111 nuclei from mouse and human archived brain samples to demonstrate sensitive, efficient, and unbiased classification of cell types, paving the way for systematic charting of cell atlases.


Article
Peer Reviewed

Multi-tiered genomic analysis of head and neck cancer ties TP53 mutation to 3p loss

UC San Diego Previously Published Works (2014)

Head and neck squamous cell carcinoma (HNSCC) is characterized by aggressive behavior with a propensity for metastasis and recurrence. Here we report a comprehensive analysis of the molecular and clinical features of HNSCC that govern patient survival. We find that TP53 mutation is frequently accompanied by loss of chromosome 3p and that the combination of these events is associated with a surprising decrease in survival time (1.9 years versus >5 years for TP53 mutation alone). The TP53-3p interaction is specific to chromosome 3p and validates in HNSCC and pan-cancer cohorts. In human papillomavirus (HPV)-positive tumors, in which HPV inactivates TP53, 3p deletion is also common and is associated with poor outcomes. The TP53-3p event is modified by mir-548k expression, which decreases survival further, and is mutually exclusive with mutations affecting RAS signaling. Together, the identified markers underscore the molecular heterogeneity of HNSCC and enable a new multi-tiered classification of this disease.


Article
Peer Reviewed

Interaction Landscape of Inherited Polymorphisms with Somatic Events in Cancer

UC San Diego Previously Published Works (2017)

Recent studies have characterized the extensive somatic alterations that arise during cancer. However, the somatic evolution of a tumor may be significantly affected by inherited polymorphisms carried in the germline. Here, we analyze genomic data for 5,954 tumors to reveal and systematically validate 412 genetic interactions between germline polymorphisms and major somatic events, including tumor formation in specific tissues and alteration of specific cancer genes. Among germline-somatic interactions, we found germline variants in RBFOX1 that increased incidence of SF3B1 somatic mutation by 8-fold via functional alterations in RNA splicing. Similarly, 19p13.3 variants were associated with a 4-fold increased likelihood of somatic mutations in PTEN. In support of this association, we found that PTEN knockdown sensitizes the MTOR pathway to high expression of the 19p13.3 gene GNA11. Finally, we observed that stratifying patients by germline polymorphisms exposed distinct somatic mutation landscapes, implicating new cancer genes. This study creates a validated resource of inherited variants that govern where and how cancer develops, opening avenues for prevention research. Significance: This study systematically identifies germline variants that directly affect tumor evolution, either by dramatically increasing alteration frequency of specific cancer genes or by influencing the site where a tumor develops. Cancer Discovery; 7(4); 410-23. ©2017 AACR. See related commentary by Geeleher and Huang, p. 354. This article is highlighted in the In This Issue feature, p. 339.


Article
Peer Reviewed

Multimodal single-cell and whole-genome sequencing of small, frozen clinical specimens

UCLA Previously Published Works (2023)

Single-cell genomics enables dissection of tumor heterogeneity and molecular underpinnings of drug response at an unprecedented resolution¹⁻¹¹. However, broad clinical application of these methods remains challenging, due to several practical and preanalytical challenges that are incompatible with typical clinical care workflows, namely the need for relatively large, fresh tissue inputs. In the present study, we show that multimodal, single-nucleus (sn)RNA/T cell receptor (TCR) sequencing, spatial transcriptomics and whole-genome sequencing (WGS) are feasible from small, frozen tissues that approximate routinely collected clinical specimens (for example, core needle biopsies). Compared with data from sample-matched fresh tissue, we find a similar quality in the biological outputs of snRNA/TCR-seq data, while reducing artifactual signals and compositional biases introduced by fresh tissue processing. Profiling sequentially collected melanoma samples from a patient treated in the KEYNOTE-001 trial¹², we resolved cellular, genomic, spatial and clonotype dynamics that represent molecular patterns of heterogeneous intralesional evolution during anti-programmed cell death protein 1 therapy. To demonstrate applicability to banked biospecimens of rare diseases¹³, we generated a single-cell atlas of uveal melanoma liver metastasis with matched WGS data. These results show that single-cell genomics from archival, clinical specimens is feasible and provides a framework for translating these methods more broadly to the clinical arena.


Article
Peer Reviewed

Using Functional Signature Ontology (FUSION) to Identify Mechanisms of Action for Natural Products

UC Santa Cruz Previously Published Works (2013)

A challenge for biomedical research is the development of pharmaceuticals that appropriately target disease mechanisms. Natural products can be a rich source of bioactive chemicals for medicinal applications but can act through unknown mechanisms and can be difficult to produce or obtain. To address these challenges, we developed a new marine-derived, renewable natural products resource and a method for linking bioactive derivatives of this library to the proteins and biological processes that they target in cells. We used cell-based screening and computational analysis to match gene expression signatures produced by natural products to those produced by small interfering RNA (siRNA) and synthetic microRNA (miRNA) libraries. With this strategy, we matched proteins and miRNAs with diverse biological processes and also identified putative protein targets and mechanisms of action for several previously undescribed marine-derived natural products. We confirmed mechanistic relationships for selected siRNAs, miRNAs, and compounds with functional roles in autophagy, chemotaxis mediated by discoidin domain receptor 2, or activation of the kinase AKT. Thus, this approach may be an effective method for screening new drugs while simultaneously identifying their targets.


Article
Peer Reviewed

Inactivation of Capicua drives cancer metastasis

UC San Francisco Previously Published Works (2017)

Metastasis is the leading cause of death in people with lung cancer, yet the molecular effectors underlying tumor dissemination remain poorly defined. Through the development of an in vivo spontaneous lung cancer metastasis model, we show that the developmentally regulated transcriptional repressor Capicua (CIC) suppresses invasion and metastasis. Inactivation of CIC relieves repression of its effector ETV4, driving ETV4-mediated upregulation of MMP24, which is necessary and sufficient for metastasis. Loss of CIC, or an increase in levels of its effectors ETV4 and MMP24, is a biomarker of tumor progression and worse outcomes in people with lung and/or gastric cancer. Our findings reveal CIC as a conserved metastasis suppressor, highlighting new anti-metastatic strategies that could potentially improve patient outcomes.


Article
Peer Reviewed

Synthetic Essentiality of Metabolic Regulator PDHK1 in PTEN-Deficient Cells and Cancers

UC San Francisco Previously Published Works (2019)

Phosphatase and tensin homolog deleted on chromosome 10 (PTEN) is a tumor suppressor and bi-functional lipid and protein phosphatase. We report that the metabolic regulator pyruvate dehydrogenase kinase 1 (PDHK1) is a synthetic-essential gene in PTEN-deficient cancer and normal cells. The PTEN protein phosphatase dephosphorylates nuclear factor κB (NF-κB)-activating protein (NKAP) and limits NF-κB activation to suppress expression of PDHK1, an NF-κB target gene. Loss of the PTEN protein phosphatase upregulates PDHK1 to induce aerobic glycolysis and PDHK1 cellular dependence. PTEN-deficient human tumors harbor increased PDHK1, a biomarker of decreased patient survival. This study uncovers a PTEN-regulated signaling pathway and reveals PDHK1 as a potential target in PTEN-deficient cancers.
