Thesis
Peer Reviewed

Tools for Extracting Actionable Medical Knowledge from Genomic Big Data

UC Santa Cruz Electronic Theses and Dissertations (2013)

Cancer is an ideal target for personal genomics-based medicine that uses high-throughput genome assays such as DNA sequencing, RNA sequencing, and expression analysis (collectively called omics); however, researchers and physicians are overwhelmed by the quantities of big data from these assays and cannot interpret this information accurately without specialized tools. To address this problem, I have created software methods and tools called OCCAM (OmiC data Cancer Analytic Model) and DIPSC (Differential Pathway Signature Correlation) for automatically extracting knowledge from this data and turning it into an actionable knowledge base called the activitome. An activitome signature measures a mutation's effect on the cellular molecular pathway. Activitome signatures can also be computed for clinical phenotypes. By comparing the vectors of activitome signatures of different mutations and clinical outcomes, intrinsic relationships between these events may be uncovered. OCCAM identifies activitome signatures that can be used to guide the development and application of therapies. DIPSC overcomes the confounding problem of correlating multiple activitome signatures from the same set of samples. In addition, to support the collection of this big data, I have developed MedBook, a federated distributed social network designed for medical research and decision support. OCCAM and DIPSC are two of the many apps that will operate inside of MedBook. MedBook extends the Galaxy system with a signature database, an end-user oriented application platform, a data-rich medical knowledge-publishing model, and the Biomedical Evidence Graph (BMEG). The goal of MedBook is to improve outcomes by learning from every patient.
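The idea of comparing signature vectors can be illustrated with a minimal sketch. It is not the OCCAM or DIPSC method: it assumes each signature is simply a list of per-pathway scores in a shared pathway order, uses plain Pearson correlation as the similarity measure, and does nothing about the shared-sample confounding that the abstract says DIPSC addresses. The signature names and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


# Hypothetical signatures: one score per pathway, same pathway order in each.
signatures = {
    "mutation_A": [1.0, 2.0, 3.0, 4.0],
    "mutation_B": [2.0, 4.0, 6.0, 8.0],
    "outcome_X": [4.0, 3.0, 2.0, 1.0],
}


def rank_similar(name, sigs):
    """Return the other signatures sorted by descending correlation with `name`."""
    target = sigs[name]
    scores = [
        (other, pearson(target, vec))
        for other, vec in sigs.items()
        if other != name
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_similar("mutation_A", signatures)` lists the most positively correlated signature first; a strongly negative correlation, as with the hypothetical `outcome_X`, would suggest an opposing pathway effect.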

Article
Peer Reviewed

Barriers to accessing public cancer genomic data.

UC Santa Cruz Previously Published Works (2019)

Although increasingly recognized as critical to genomic research, genomic data sharing is hindered by an absence of standards regarding timing, patient privacy, use agreement standards, and data characterization and quality. Only after months of identifying, permissioning for use, committing to terms restricting use and sharing, downloading, and assessing quality, is it possible to know whether or not a dataset can be used. In this paper, we evaluate the barriers to data sharing based on the Treehouse experience and offer recommendations for use agreement standards, data characterization and metadata standardization to enhance data sharing and outcomes for all pediatric cancer patients.

Article
Peer Reviewed

TumorMap: Exploring the Molecular Similarities of Cancer Samples in an Interactive Portal

UC Santa Cruz Previously Published Works (2017)

Vast amounts of molecular data are being collected on tumor samples, which provide unique opportunities for discovering trends within and between cancer subtypes. Such cross-cancer analyses require computational methods that enable intuitive and interactive browsing of thousands of samples based on their molecular similarity. We created a portal called TumorMap to assist in exploration and statistical interrogation of high-dimensional complex "omics" data in an interactive and easily interpretable way. In the TumorMap, samples are arranged on a hexagonal grid based on their similarity to one another in the original genomic space and are rendered with Google's Map technology. While the important feature of this public portal is the ability for the users to build maps from their own data, we pre-built genomic maps from several previously published projects. We demonstrate the utility of this portal by presenting results obtained from The Cancer Genome Atlas project data. Cancer Res; 77(21); e111-4. ©2017 AACR.

Article
Peer Reviewed

Protected Health Information filter (Philter): accurately and securely de-identifying free-text clinical notes

UC San Francisco Previously Published Works (2020)

There is a great and growing need to ascertain what exactly is the state of a patient, in terms of disease progression, actual care practices, pathology, adverse events, and much more, beyond the paucity of data available in structured medical record data. Ascertaining these harder-to-reach data elements is now critical for the accurate phenotyping of complex traits, detection of adverse outcomes, efficacy of off-label drug use, and longitudinal patient surveillance. Clinical notes often contain the most detailed and relevant digital information about individual patients, the nuances of their diseases, the treatment strategies selected by physicians, and the resulting outcomes. However, notes remain largely unused for research because they contain Protected Health Information (PHI), which is synonymous with individually identifying data. Previous clinical note de-identification approaches have been rigid and still too inaccurate to see any substantial real-world use, primarily because they have been trained on medical text corpora that are too small. To build a new de-identification tool, we created the largest manually annotated clinical note corpus for PHI and developed a customizable open-source de-identification software called Philter ("Protected Health Information filter"). Here we describe the design and evaluation of Philter, and show how it offers substantial real-world improvements over prior methods.

Article
Peer Reviewed

PatientExploreR: an extensible application for dynamic visualization of patient clinical history from Electronic Health Records in the OMOP Common Data Model

UC San Francisco Previously Published Works (2019)

Motivation

Electronic health records (EHRs) are quickly becoming omnipresent in healthcare, but interoperability issues and technical demands limit their use for biomedical and clinical research. Interactive and flexible software that interfaces directly with EHR data structured around a common data model (CDM) could accelerate more EHR-based research by making the data more accessible to researchers who lack computational expertise and/or domain knowledge.

Results

We present PatientExploreR, an extensible application built on the R/Shiny framework that interfaces with a relational database of EHR data in the Observational Medical Outcomes Partnership CDM format. PatientExploreR produces patient-level interactive and dynamic reports and facilitates visualization of clinical data without any programming required. It allows researchers to easily construct and export patient cohorts from the EHR for analysis with other software. This application could enable easier exploration of patient-level data for physicians and researchers. PatientExploreR can incorporate EHR data from any institution that employs the CDM for users with approved access. The software code is free and open source under the MIT license, enabling institutions to install and users to expand and modify the application for their own purposes.

Availability and implementation

PatientExploreR can be freely obtained from GitHub: https://github.com/BenGlicksberg/PatientExploreR. We provide instructions for how researchers with approved access to their institutional EHR can use this package. We also release an open sandbox server of synthesized patient data for users without EHR access to explore: http://patientexplorer.ucsf.edu.

Supplementary information

Supplementary data are available at Bioinformatics online.
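The cohort construction described in the Results can be sketched as a query against two OMOP CDM tables. This is an illustrative sketch only, not PatientExploreR's code (which is built on R/Shiny): it uses Python's built-in sqlite3 with an in-memory database, includes only a few columns of the `person` and `condition_occurrence` tables, and fills them with made-up rows. The concept IDs (1001, 1002) and the helper names are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a tiny subset of two OMOP CDM tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (
            person_id INTEGER PRIMARY KEY,
            year_of_birth INTEGER
        );
        CREATE TABLE condition_occurrence (
            condition_occurrence_id INTEGER PRIMARY KEY,
            person_id INTEGER,
            condition_concept_id INTEGER,
            condition_start_date TEXT
        );
        """
    )
    # Synthetic rows; the concept IDs are placeholders, not real vocabulary codes.
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [(1, 1950), (2, 1980), (3, 1965)],
    )
    conn.executemany(
        "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)",
        [
            (10, 1, 1001, "2015-03-01"),
            (11, 2, 1002, "2016-07-12"),
            (12, 3, 1001, "2017-01-20"),
            (13, 1, 1002, "2018-05-05"),
        ],
    )
    return conn


def cohort_for_condition(conn, concept_id):
    """Return sorted person_ids with at least one record of the given condition concept."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.person_id
        FROM person AS p
        JOIN condition_occurrence AS c ON c.person_id = p.person_id
        WHERE c.condition_concept_id = ?
        ORDER BY p.person_id
        """,
        (concept_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Because the query relies only on table and column names defined by the CDM, the same pattern would in principle carry over to any institution's OMOP-formatted database, which is the portability the abstract emphasizes.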

Article
Peer Reviewed

Dysregulation of hsa-miR-34a and hsa-miR-449a leads to overexpression of PACS-1 and loss of DNA damage response (DDR) in cervical cancer

UCLA Previously Published Works (2020)

We have observed overexpression of PACS-1, a cytosolic sorting protein, in primary cervical tumors. Absence of exonic mutations and overexpression at the RNA level suggested transcriptional and/or posttranscriptional regulation. University of California Santa Cruz genome browser analysis of PACS-1 microRNAs (miR) revealed two 8-base target sequences at the 3' terminus for hsa-miR-34a and hsa-miR-449a. Quantitative RT-PCR and Northern blotting studies showed reduced or lost expression of the two microRNAs in cervical cancer cell lines and primary tumors, indicating dysregulation of these two microRNAs in cervical cancer. Loss of PACS-1 with siRNA or exogenous expression of hsa-miR-34a or hsa-miR-449a in HeLa and SiHa cervical cancer cell lines resulted in DNA damage response, S-phase cell cycle arrest, and reduction in cell growth. Furthermore, the siRNA studies showed that loss of PACS-1 expression was accompanied by increased nuclear γH2AX expression, Lys³⁸²-p53 acetylation, and genomic instability. PACS-1 re-expression through LNA-hsa-anti-miR-34a or -449a or through PACS-1 cDNA transfection led to the reversal of DNA damage response and restoration of cell growth. Release of cells post 24-h serum starvation showed PACS-1 nuclear localization at G₁-S phase of the cell cycle. Our results therefore indicate that the loss of hsa-miR-34a and hsa-miR-449a expression in cervical cancer leads to overexpression of PACS-1 and suppression of DNA damage response, resulting in the development of chemo-resistant tumors.

Article
Peer Reviewed

MEK-ERK signaling is a therapeutic target in metastatic castration resistant prostate cancer

UCLA Previously Published Works (2019)

Background

Metastatic castration resistant prostate cancer (mCRPC) is incurable and progression after drugs that target the androgen receptor-signaling axis is inevitable. Thus, there is an urgent need to develop more effective treatments beyond hormonal manipulation. We sought to identify activated kinases in mCRPC as therapeutic targets for existing, approved agents, with the goal of identifying candidate drugs for rapid translation into proof of concept Phase II trials in mCRPC.

Methods

To identify evidence of activation of druggable kinases in these patients, we compared mRNA expression from metastatic biopsies of patients with mCRPC (n = 101) to mRNA expression in localized prostate cancer from TCGA and used this analysis to infer differential kinase activity. In addition, we assessed the differential phosphorylation levels for key MAPK pathway kinases between mCRPC and localized prostate cancers.

Results

Transcriptomic profiling of 101 patients with mCRPC as compared to patients with localized prostate cancer identified evidence of hyperactive ERK1, and whole genome sequencing revealed frequent amplifications of members of the MAPK pathway in 32% of this cohort. Next, we confirmed elevated levels of phosphorylated ERK1/2 in castration resistant prostate cancer as compared to untreated primary prostate cancer. We observed that the presence of detectable phosphorylated ERK1/2 in the primary tumor is associated with biochemical failure after radical prostatectomy independent of clinicopathologic features. ERK1 is the immediate downstream target of MEK1/2, which is druggable with trametinib, an approved therapeutic for melanoma. Trametinib elicited a profound biochemical and clinical response in a patient who had failed multiple prior treatments for mCRPC.

Conclusions

We conclude that pharmacologic targeting of the MEK/ERK pathway may be a viable treatment strategy for patients with refractory metastatic prostate cancer. An ongoing Phase II trial tests this hypothesis.

Article
Peer Reviewed

Whole-genome analysis informs breast cancer response to aromatase inhibition

UC San Francisco Previously Published Works (2012)

To correlate the variable clinical features of oestrogen-receptor-positive breast cancer with somatic alterations, we studied pretreatment tumour biopsies accrued from patients in two studies of neoadjuvant aromatase inhibitor therapy by massively parallel sequencing and analysis. Eighteen significantly mutated genes were identified, including five genes (RUNX1, CBFB, MYH9, MLL3 and SF3B1) previously linked to haematopoietic disorders. Mutant MAP3K1 was associated with luminal A status, low-grade histology and low proliferation rates, whereas mutant TP53 was associated with the opposite pattern. Moreover, mutant GATA3 correlated with suppression of proliferation upon aromatase inhibitor treatment. Pathway analysis demonstrated that mutations in MAP2K4, a MAP3K1 substrate, produced similar perturbations as MAP3K1 loss. Distinct phenotypes in oestrogen-receptor-positive breast cancer are associated with specific patterns of somatic mutations that map into cellular pathways linked to tumour biology, but most recurrent mutations are relatively infrequent. Prospective clinical trials based on these findings will require comprehensive genome sequencing.

Article
Peer Reviewed

PDX-MI: Minimal Information for Patient-Derived Tumor Xenograft Models

UC San Francisco Previously Published Works (2017)

Patient-derived tumor xenograft (PDX) mouse models have emerged as an important oncology research platform to study tumor evolution, mechanisms of drug response and resistance, and tailoring chemotherapeutic approaches for individual patients. The lack of robust standards for reporting on PDX models has hampered the ability of researchers to find relevant PDX models and associated data. Here we present the PDX models minimal information standard (PDX-MI) for reporting on the generation, quality assurance, and use of PDX models. PDX-MI defines the minimal information for describing the clinical attributes of a patient's tumor, the processes of implantation and passaging of tumors in a host mouse strain, quality assurance methods, and the use of PDX models in cancer research. Adherence to PDX-MI standards will facilitate accurate search results for oncology models and their associated data across distributed repository databases and promote reproducibility in research studies using these models. Cancer Res; 77(21); e62-66. ©2017 AACR.
