Article
Peer Reviewed

Optimization of miRNA-seq data preprocessing

UC Davis Previously Published Works (2015)

The past two decades of microRNA (miRNA) research have solidified the role of these small non-coding RNAs as key regulators of many biological processes and promising biomarkers for disease. The concurrent development of high-throughput profiling technology has further advanced our understanding of the impact of their dysregulation on a global scale. Currently, next-generation sequencing is the platform of choice for the discovery and quantification of miRNAs. Despite this, there is no clear consensus on how the data should be preprocessed before conducting downstream analyses. Often overlooked, data preprocessing is an essential step in data analysis: the presence of unreliable features and noise can affect the conclusions drawn from downstream analyses. Using a spike-in dilution study, we evaluated the effects of several general-purpose aligners (BWA, Bowtie, Bowtie 2 and Novoalign) and normalization methods (counts-per-million, total count scaling, upper quartile scaling, Trimmed Mean of M-values, DESeq, linear regression, cyclic loess and quantile) with respect to the final miRNA count data distribution, variance, bias and accuracy of differential expression analysis. We make practical recommendations on the optimal preprocessing methods for the extraction and interpretation of miRNA count data from small RNA-sequencing experiments.
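As a rough illustration of two of the normalization methods named in the abstract above, the following sketch applies counts-per-million and upper quartile scaling to one sample's read counts. It is not taken from the paper: the function names and the toy counts are hypothetical, and the upper quartile here is computed over nonzero counts with Python's inclusive quantile definition, which is only one of several conventions in use.

```python
import statistics


def counts_per_million(counts):
    """Scale raw read counts so that the sample sums to one million."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [c * 1_000_000 / total for c in counts]


def upper_quartile_scale(counts):
    """Divide each count by the 75th percentile of the nonzero counts.

    Assumption: zeros are excluded before taking the quartile, and the
    'inclusive' quantile method is used; other tools may differ.
    """
    nonzero = [c for c in counts if c > 0]
    if len(nonzero) < 2:
        raise ValueError("need at least two nonzero counts")
    upper_quartile = statistics.quantiles(nonzero, n=4, method="inclusive")[2]
    return [c / upper_quartile for c in counts]


if __name__ == "__main__":
    # Hypothetical miRNA read counts for a single sample.
    sample = [0, 10, 20, 30, 40, 50]
    print(counts_per_million(sample))
    print(upper_quartile_scale(sample))
```

Counts-per-million depends on the total library size, so a few very abundant miRNAs can dominate the scaling factor; upper quartile scaling is one way to reduce that dependence.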

Article
Peer Reviewed

Exploiting the noise: improving biomarkers with ensembles of data analysis methodologies

UCLA Previously Published Works (2012)

Background

The advent of personalized medicine requires robust, reproducible biomarkers that indicate which treatment will maximize therapeutic benefit while minimizing side effects and costs. Numerous molecular signatures have been developed over the past decade to fill this need, but their validation and up-take into clinical settings have been poor. Here, we investigate the technical reasons underlying reported failures in biomarker validation for non-small cell lung cancer (NSCLC).

Methods

We evaluated two published prognostic multi-gene biomarkers for NSCLC in an independent 442-patient dataset. We then systematically assessed how technical factors influenced validation success.

Results

Both biomarkers validated successfully (biomarker #1: hazard ratio (HR) 1.63, 95% confidence interval (CI) 1.21 to 2.19, P = 0.001; biomarker #2: HR 1.42, 95% CI 1.03 to 1.96, P = 0.030). Further, despite being underpowered for stage-specific analyses, both biomarkers successfully stratified stage II patients and biomarker #1 also stratified stage IB patients. We then systematically evaluated reasons for reported validation failures and found that they can be directly attributed to technical challenges in data analysis. By examining 24 separate pre-processing techniques we show that minor alterations in pre-processing can change a successful prognostic biomarker (HR 1.85, 95% CI 1.37 to 2.50, P < 0.001) into one indistinguishable from random chance (HR 1.15, 95% CI 0.86 to 1.54, P = 0.348). Finally, we develop a new method, based on ensembles of analysis methodologies, to exploit this technical variability to improve biomarker robustness and to provide an independent confidence metric.

Conclusions

Biomarkers comprise a fundamental component of personalized medicine. We first validated two NSCLC prognostic biomarkers in an independent patient cohort. Power analyses demonstrate that even this large, 442-patient cohort is under-powered for stage-specific analyses. We then use these results to discover an unexpected sensitivity of validation to subtle data analysis decisions. Finally, we develop a novel algorithmic approach to exploit this sensitivity to improve biomarker robustness.
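The hazard ratios, confidence intervals and P values quoted in the Results above are linked by simple arithmetic, which the sketch below illustrates. It assumes a normal (Wald) approximation on the log hazard ratio scale, which may not be the exact test the authors used, so the recovered P values are only expected to be close to the reported ones. The function name is hypothetical.

```python
import math


def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate a two-sided P value from a hazard ratio and its 95% CI.

    Assumption: the interval is symmetric on the log scale, so the
    standard error is the log-interval width divided by 2 * z_crit.
    """
    log_hr = math.log(hr)
    std_err = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / std_err
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Values quoted in the abstract above.
    reported = {
        "biomarker #1": (1.63, 1.21, 2.19),
        "biomarker #2": (1.42, 1.03, 1.96),
        "altered pre-processing": (1.15, 0.86, 1.54),
    }
    for label, (hr, lo, hi) in reported.items():
        z, p = wald_p_from_ci(hr, lo, hi)
        print(f"{label}: z = {z:.2f}, approximate P = {p:.3f}")
```

Under this approximation the first biomarker gives a P value on the order of 0.001, while the altered pre-processing case gives one well above 0.05, in line with the figures reported in the abstract.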

Article
Peer Reviewed

Early innate and adaptive immune perturbations determine long-term severity of chronic virus and Mycobacterium tuberculosis coinfection

UCLA Previously Published Works (2021)

Chronic viral infections increase severity of Mycobacterium tuberculosis (Mtb) coinfection. Here, we examined how chronic viral infections alter the pulmonary microenvironment to foster coinfection and worsen disease severity. We developed a coordinated system of chronic virus and Mtb infection that induced central clinical manifestations of coinfection, including increased Mtb burden, extra-pulmonary dissemination, and heightened mortality. These disease states were not due to chronic virus-induced immunosuppression or exhaustion; rather, increased amounts of the cytokine TNFα initially arrested pulmonary Mtb growth, impeding dendritic cell-mediated antigen transportation to the lymph node and subverting immune-surveillance, allowing bacterial sanctuary. The cryptic Mtb replication delayed CD4 T cell priming, redirecting T helper (Th) 1 toward Th17 differentiation and increasing pulmonary neutrophilia, which diminished long-term survival. Temporally restoring CD4 T cell induction overcame these diverse disease sequelae to enhance Mtb control. Thus, Mtb co-opts TNFα from the chronic inflammatory environment to subvert immune-surveillance, avert early immune function, and foster long-term coinfection.

Creative Commons 'BY' version 4.0 license

Article
Peer Reviewed

Uncovering the underlying immune perturbations that determine long-term severity of chronic virus and Mycobacterium tuberculosis coinfection

UCLA Previously Published Works (2021)

Article
Peer Reviewed

The Role of Cancer-Testis Antigens as Predictive and Prognostic Markers in Non-Small Cell Lung Cancer

UCLA Previously Published Works (2013)

Background

Cancer-Testis Antigens (CTAs) are immunogenic proteins that are poor prognostic markers in non-small cell lung cancer (NSCLC). We investigated expression of CTAs in NSCLC and their association with response to chemotherapy, genetic mutations and survival.

Methods

We studied 199 patients with pathological N2 NSCLC treated with neoadjuvant chemotherapy (NAC; n = 94), post-operative observation (n = 49), adjuvant chemotherapy (n = 47) or unknown (n = 9). Immunohistochemistry for NY-ESO-1, MAGE-A and MAGE-C1 was performed. Clinicopathological features, response to neoadjuvant treatment and overall survival were correlated. DNA mutations were characterized using the Sequenom Oncocarta panel v1.0. Affymetrix data from the JBR.10 adjuvant chemotherapy study were obtained from a public repository, normalised and mapped for CTAs.

Results

NY-ESO-1 was expressed in 50/199 (25%) samples. Expression of NY-ESO-1 in the NAC cohort was associated with significantly increased response rates (P = 0.03), but not overall survival. In the post-operative cohort, multivariate analyses identified NY-ESO-1 as an independent poor prognostic marker for those not treated with chemotherapy (HR 2.61, 95% CI 1.28-5.33; P = 0.008), whereas treatment with chemotherapy and expression of NY-ESO-1 was an independent predictor of improved survival (HR 0.267, 95% CI 0.07-0.980; P = 0.046). Similar findings for MAGE-A were seen, but did not meet statistical significance. Independent gene expression data from the JBR.10 dataset support these findings but were underpowered to demonstrate significant differences. There was no association between oncogenic mutations and CTA expression.

Conclusions

NY-ESO-1 was predictive of increased response to neoadjuvant chemotherapy and benefit from adjuvant chemotherapy. Further studies investigating the relationship between these findings and immune mechanisms are warranted.

Article
Peer Reviewed

Appropriateness of Using Patient-Derived Xenograft Models for Pharmacologic Evaluation of Novel Therapies for Esophageal/Gastro-Esophageal Junction Cancers

UCLA Previously Published Works (2015)

The high morbidity and mortality of patients with esophageal (E) and gastro-esophageal junction (GEJ) cancers warrant new pre-clinical models for drug testing. The utility of primary tumor xenografts (PTXGs) as pre-clinical models was assessed. Clinicopathological features, immunohistochemical markers (p53, p16, Ki-67, Her-2/neu and EGFR), and global mRNA abundance profiles were evaluated to determine selection biases of samples implanted or engrafted, compared with the underlying population. Nine primary E/GEJ adenocarcinoma xenograft lines were further characterized for the spectrum and stability of gene/protein expression over passages. Seven primary esophageal adenocarcinoma xenograft lines were treated with individual or combination chemotherapy. Tumors that were implanted (n=55) in NOD/SCID mice had features suggestive of more aggressive biology than tumors that were never implanted (n=32). Of those implanted, 21/55 engrafted; engraftment was associated with poorly differentiated tumors (p=0.04) and older patients (p=0.01). Expression of immunohistochemical markers was similar between patient samples and corresponding xenografts. mRNA differences observed between patient tumors and first passage xenografts were largely due to loss of human stroma in xenografts. mRNA patterns of early vs late passage xenografts and of small vs large tumors of the same passage were similar. Complete resistance was present in 2/7 xenografts, while the remaining tumors showed varying degrees of sensitivity that remained constant across passages. Because of their ability to recapitulate primary tumor characteristics during engraftment and across serial passaging, PTXGs can be useful clinical systems for assessment of drug sensitivity of human E/GEJ cancers.

Article
Peer Reviewed

ONECUT2 is a driver of neuroendocrine prostate cancer

UCLA Previously Published Works (2019)

Neuroendocrine prostate cancer (NEPC), a lethal form of the disease, is characterized by loss of androgen receptor (AR) signaling during neuroendocrine transdifferentiation, which results in resistance to AR-targeted therapy. Clinically, genomically and epigenetically, NEPC resembles other types of poorly differentiated neuroendocrine tumors (NETs). Through pan-NET analyses, we identified ONECUT2 as a candidate master transcriptional regulator of poorly differentiated NETs. ONECUT2 ectopic expression in prostate adenocarcinoma synergizes with hypoxia to suppress androgen signaling and induce neuroendocrine plasticity. ONECUT2 drives tumor aggressiveness in NEPC, partially through regulating hypoxia signaling and tumor hypoxia. Specifically, ONECUT2 activates SMAD3, which regulates hypoxia signaling through modulating HIF1α chromatin-binding, leading NEPC to exhibit higher degrees of hypoxia compared to prostate adenocarcinomas. Treatment with hypoxia-activated prodrug TH-302 potently reduces NEPC tumor growth. Collectively, these results highlight the synergy between ONECUT2 and hypoxia in driving NEPC, and emphasize the potential of hypoxia-directed therapy for NEPC patients.

Article
Peer Reviewed

Obesity, metabolic factors and risk of different histological types of lung cancer: A Mendelian randomization study

Background

Assessing the relationship between lung cancer and metabolic conditions is challenging because of the confounding effect of tobacco. Mendelian randomization (MR), or the use of genetic instrumental variables to assess causality, may help to identify the metabolic drivers of lung cancer.

Methods and findings

We identified genetic instruments for potential metabolic risk factors and evaluated these in relation to risk using 29,266 lung cancer cases (including 11,273 adenocarcinomas, 7,426 squamous cell and 2,664 small cell cases) and 56,450 controls. The MR risk analysis suggested a causal effect of body mass index (BMI) on lung cancer risk for two of the three major histological subtypes, with evidence of a risk increase for squamous cell carcinoma (odds ratio (OR) [95% confidence interval (CI)] = 1.20 [1.01-1.43]) and for small cell lung cancer (OR [95%CI] = 1.52 [1.15-2.00]) for each standard deviation (SD) increase in BMI [4.6 kg/m²], but not for adenocarcinoma (OR [95%CI] = 0.93 [0.79-1.08]) (P for heterogeneity = 4.3 × 10⁻³). Additional analysis using a genetic instrument for BMI showed that each SD increase in BMI increased cigarette consumption by 1.27 cigarettes per day (P = 2.1 × 10⁻³), providing novel evidence that a genetic susceptibility to obesity influences smoking patterns. There was also evidence that low-density lipoprotein cholesterol was inversely associated with lung cancer overall risk (OR [95%CI] = 0.90 [0.84-0.97] per SD of 38 mg/dl), while fasting insulin was positively associated (OR [95%CI] = 1.63 [1.25-2.13] per SD of 44.4 pmol/l). Sensitivity analyses including a weighted-median approach and MR-Egger test did not detect other pleiotropic effects biasing the main results.

Conclusions

Our results are consistent with a causal role of fasting insulin and low-density lipoprotein cholesterol in lung cancer etiology, as well as for BMI in squamous cell and small cell carcinoma. The latter relation may be mediated by a previously unrecognized effect of obesity on smoking behavior.

Article
Peer Reviewed

A renewed model of pancreatic cancer evolution based on genomic rearrangement patterns

UC Davis Previously Published Works (2016)

Pancreatic cancer, a highly aggressive tumour type with uniformly poor prognosis, exemplifies the classically held view of stepwise cancer development. The current model of tumorigenesis, based on analyses of precursor lesions termed pancreatic intraepithelial neoplasms (PanINs), makes two predictions: first, that pancreatic cancer develops through a particular sequence of genetic alterations (KRAS, followed by CDKN2A, then TP53 and SMAD4); and second, that the evolutionary trajectory of pancreatic cancer progression is gradual because each alteration is acquired independently. A shortcoming of this model is that clonally expanded precursor lesions do not always belong to the tumour lineage, indicating that the evolutionary trajectory of the tumour lineage and precursor lesions can be divergent. This prevailing model of tumorigenesis has contributed to the clinical notion that pancreatic cancer evolves slowly and presents at a late stage. However, the propensity for this disease to rapidly metastasize and the inability to improve patient outcomes, despite efforts aimed at early detection, suggest that pancreatic cancer progression is not gradual. Here, using newly developed informatics tools, we tracked changes in DNA copy number and their associated rearrangements in tumour-enriched genomes and found that pancreatic cancer tumorigenesis is neither gradual nor follows the accepted mutation order. Two-thirds of tumours harbour complex rearrangement patterns associated with mitotic errors, consistent with punctuated equilibrium as the principal evolutionary trajectory. In a subset of cases, the consequence of such errors is the simultaneous, rather than sequential, knockout of canonical preneoplastic genetic drivers that are likely to set off invasive cancer growth. These findings challenge the current progression model of pancreatic cancer and provide insights into the mutational processes that give rise to these aggressive tumours.

Article
Peer Reviewed

Comprehensive genomic characterization of squamous cell lung cancers

UC San Francisco Previously Published Works (2012)

Lung squamous cell carcinoma is a common type of lung cancer, causing approximately 400,000 deaths per year worldwide. Genomic alterations in squamous cell lung cancers have not been comprehensively characterized, and no molecularly targeted agents have been specifically developed for its treatment. As part of The Cancer Genome Atlas, here we profile 178 lung squamous cell carcinomas to provide a comprehensive landscape of genomic and epigenomic alterations. We show that the tumour type is characterized by complex genomic alterations, with a mean of 360 exonic mutations, 165 genomic rearrangements, and 323 segments of copy number alteration per tumour. We find statistically recurrent mutations in 11 genes, including mutation of TP53 in nearly all specimens. Previously unreported loss-of-function mutations are seen in the HLA-A class I major histocompatibility gene. Significantly altered pathways included NFE2L2 and KEAP1 in 34%, squamous differentiation genes in 44%, phosphatidylinositol-3-OH kinase pathway genes in 47%, and CDKN2A and RB1 in 72% of tumours. We identified a potential therapeutic target in most tumours, offering new avenues of investigation for the treatment of squamous cell lung cancers.
