Using Germline Variation to Study Inter-Individual Variability in Cancer Risk and Host Anti-Tumor Immune Response
Cancer is a complex disease driven by genetic variation(1–3). The two main types of genetic variation are germline variation, which is inherited, and somatic variation, which is acquired through environmental exposures and endogenous processes such as DNA replication. Both types have been critical for precision medicine, in which care is tailored to each patient’s individual cancer. Traditionally, germline variation has been used for risk stratification, while somatic variation has been used for treatment selection. For patient risk stratification, many genome-wide association studies (GWASs) have been conducted to identify genetic determinants that predict an individual’s cancer risk. To date, the GWAS Catalog reports 128,550 associations from over 4,000 publications(4). Despite the wealth of information provided by GWASs, the biological insights gained from them are limited, and how inherited variation contributes to cancer behavior remains poorly understood.
One major limitation of GWASs is the lack of clinical information needed to interpret associations. Most large databases contain only basic information, such as demographics and cancer status. However, germline variation can underlie important patterns of tumor behavior, such as the immune microenvironment, cancer driver mutation frequency, and response to therapy. A better understanding of these germline-somatic interactions can help improve precision medicine efforts. Another major limitation of GWASs is the lack of diverse genetic cohorts. Genetic cohorts are dominated by individuals of European ancestry, and few cohorts of underrepresented populations, such as African-American and Hispanic individuals, exist. In this dissertation, I address these limitations by characterizing germline determinants underlying the tumor immune microenvironment and by studying genetic associations in diverse ancestral populations.
First, in Chapter 1, I identified and characterized germline variants underlying the tumor immune microenvironment in one of the largest adult cancer cohorts, The Cancer Genome Atlas (TCGA). This analysis was motivated by the fact that few biomarkers for immunotherapy exist and that a better understanding of the germline determinants underlying tumor-immune interactions is needed, especially considering that germline variants are present in immune cells as well as cancer cells. I identified tumor immune microenvironment SNPs (TIME-SNPs) underlying 157 SNP-heritable immune phenotype components (IP-components) through our own TCGA analysis. Combining these TIME-SNPs with ones collected from the literature, we then evaluated the cell-type effects of these variants. Finally, we used various bioinformatic pipelines and databases to determine which TIME-SNPs were implicated in cancer risk, survival, and immune checkpoint blockade (ICB) response. We validated one of the genes implicated by our TIME-SNPs as a potential novel immunotherapeutic target.
Next, in Chapter 2, I explored TIME-SNPs in a pediatric cancer cohort assembled from multiple databases. Pediatric patients have fewer environmental exposures than adults and have traditionally not been good candidates for immunotherapy. Building on the analysis from Chapter 1, I processed genotypes from multiple pediatric cancer cohorts and specifically explored TIME-SNPs related to antigen presentation and macrophage infiltration, as these were prominent IP-components explored in Chapter 1. In addition, pediatric cancer patients with mismatch repair (MMR) deficiencies could be good candidates for immunotherapy. Thus, we explored germline determinants underlying MMR and their associations with the TIME.
In Chapter 3, I analyzed one of the largest and most diverse genetic databases, the Million Veteran Program, for ancestry-specific associations with testosterone levels and prostate cancer. Specifically, I conducted 1) a multi-ancestry analysis of total testosterone levels, 2) discovery and evaluation of an African-ancestry-specific polygenic risk score, and 3) evaluation of a polygenic hazard score for prostate cancer in a multi-ancestry cohort. Through these analyses, I found that genetic associations can differ by ancestral background and identified novel ancestry-specific associations that improved prostate cancer prediction.
Taken together, these chapters demonstrate that studying inherited variation underlying the tumor immune microenvironment, and studying it in diverse populations, can improve our understanding of cancer and advance precision medicine efforts.