Proteogenomic approach to discover cancer aberrant peptides and antibody peptides using large-scale next-generation sequencing data
Next-generation sequencing (NGS) and other deep sequencing technologies provide information on transcribed regions, splicing events, and single-nucleotide variants across a variety of cellular conditions. Advances in these genomic technologies have improved our understanding of cancer, including molecular subtypes, cancer progression, and biomarker discovery, but
the complexity, redundancy, and errors in genomic data make it difficult to investigate aberrant genes using genomic approaches alone. Proteogenomics, the combination of proteomic and genomic technologies, is increasingly being employed. Various strategies have been developed to make use of large-scale NGS data; however, serious methodological challenges remain, especially in
the identification of multiple mutational variants, structural variations, and immune system genes, even though these play an important role in cancer. This dissertation introduces an integrative proteogenomic method that extends the limits of proteogenomic searches to identify multiple variant peptides as well as immunoglobulin gene variations using an advanced RNA-seq read assembly method. The results
provide thousands of aberrant peptides observed in colorectal cancer and an extensive characterization of the tumor immune response. The results demonstrate the presence of co-expressed pairs of antibody and aberrant peptides and show that these pairs are correlated with the survival time of the individuals.