Integrative Analysis of Genomic and Transcriptomic Data in Taiwanese Lung Adenocarcinomas
- Author(s): Ho, Hao;
- Advisor(s): Li, Ker-Chau;
- et al.
In this thesis, we studied genomic and transcriptomic data from over 300 Taiwanese lung cancer patients. For structural variation analysis, we proposed a workflow to detect inter-chromosomal structural variations from whole genome sequencing data and introduced an integrated ESP plot for visualization. We studied somatic DNA alterations and constructed a comprehensive landscape of Taiwanese lung adenocarcinomas using whole exome sequencing and array CGH data. At the single nucleotide level, we identified recurrent non-synonymous point mutations using a binomial probability model. Their potential clinical relevance was demonstrated by an analysis of patients' relapse-free survival, and mutation variant allele frequency was integrated to improve prognostic power. When exploring potential downstream effects, we identified a miRNA whose expression correlated with these recurrent point mutations. In the study of differential gene expression between EGFR-mutant and wild-type tumors, we derived a statistical framework that combines differential expression analysis and differential regulation analysis into an enrichment test for identifying critical regulators in the cis-regulatory network. A modified liquid association was introduced to quantify the change in co-variation in the differential regulation analysis. By integrating copy number, miRNA expression, and gene expression data, several key regulators and their cis-targets were identified and visualized together as a network. Finally, regarding a statistical issue of liquid association, we discussed the effects of ignoring background variables on the liquid association scoring method and proposed adjustment methods to marginalize their influence.
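As a rough illustration of the binomial probability model mentioned above for recurrent point mutations, the sketch below tests whether a site is mutated in more patients than a uniform background rate would predict. The abstract does not give the model's details, so the background rate, the Bonferroni adjustment, the site names, and the function names `binomial_tail` and `recurrent_sites` are all assumptions made for this example and not the thesis's actual procedure.

```python
from math import comb


def binomial_tail(k, n, p):
    """Upper-tail probability P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def recurrent_sites(counts, n_patients, background_rate, alpha=0.05):
    """Return sites whose Bonferroni-adjusted tail probability is below alpha.

    counts maps a site label to the number of patients carrying a mutation
    at that site. background_rate is an assumed per-patient probability of
    a mutation at any given site (hypothetical value, not from the thesis).
    """
    n_tests = len(counts)
    hits = {}
    for site, k in counts.items():
        p_value = binomial_tail(k, n_patients, background_rate)
        adjusted = min(1.0, p_value * n_tests)  # Bonferroni adjustment (assumed)
        if adjusted < alpha:
            hits[site] = adjusted
    return hits


# Hypothetical mutation counts across 300 patients.
counts = {"site_A": 12, "site_B": 1, "site_C": 4}
hits = recurrent_sites(counts, n_patients=300, background_rate=0.001)
print(sorted(hits))
```

With these made-up numbers, a site mutated in a single patient is consistent with background, while sites seen in several patients are flagged as recurrent. A real analysis would likely estimate the background rate from the data and account for gene length and sequence context.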