Although genome-wide association studies (GWAS) have now identified most of the common variants associated with celiac disease, three questions remain open: whether the previously identified common variants are valid; whether the major histocompatibility complex (MHC) region of chromosome 6 harbors common variants associated with celiac disease that are independent of the high-risk HLA genotypes; and whether previously implicated genomic regions harbor low-frequency and rare variants associated with celiac disease. This dissertation addressed these questions using GWAS, fine-mapping methods, meta-analysis methods, imputation, and next-generation sequencing (NGS) of targeted genomic regions.
In the first study of this dissertation, two large-scale celiac disease GWAS were re-analyzed using alternative random-effects meta-analysis models in addition to the fixed-effects approach employed in each original GWAS meta-analysis. Implementing a random-effects meta-analysis model did not appreciably increase or decrease the power to detect an association, and nearly all of the previously implicated loci were again genome-wide significant in the re-analysis. In the second study, a fine-mapping approach to the MHC region that accounts for the effect of the high-risk HLA genotypes was implemented. After adjustment for the high-risk HLA genotypes and for linkage disequilibrium in the MHC region, seven novel loci were found to be associated with celiac disease. In the third study, targeted NGS-based resequencing of previously implicated genomic regions was performed to test for low-frequency and rare variants associated with celiac disease. Gene-based collapsing tests revealed that dozens of genes harbor low-frequency and rare variants associated with celiac disease, particularly in the MHC region and within non-coding regions of genes. In the fourth study, low-frequency and rare variants were imputed into a large GWAS dataset to increase the statistical power to detect their associations. Nearly all of the low-frequency and rare variant associations from the third study were replicated in this fourth study, along with novel associations, using both gene-based tests and single-marker association tests.
These studies reveal that many more loci need careful follow-up in larger resequencing studies and functional studies than has been acknowledged by large-scale celiac disease GWAS that do not account for the role of the high-risk HLA genotypes or for the low-frequency and rare variants within genomic regions that harbor common variants previously found to be associated with celiac disease.
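The contrast between the fixed-effects and random-effects meta-analysis models compared in the first study can be illustrated with a minimal sketch. The abstract does not state which random-effects models were used; the DerSimonian-Laird estimator below is an assumption chosen because it is a widely used one, and the function names and input values are hypothetical and not taken from the dissertation's data.

```python
import math

def fixed_effects(betas, ses):
    """Inverse-variance fixed-effects pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)

def dersimonian_laird(betas, ses):
    """Random-effects pooled estimate using the DerSimonian-Laird tau^2.

    Assumption: this estimator stands in for the unspecified random-effects
    models mentioned in the abstract.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta_fe, _ = fixed_effects(betas, ses)
    # Cochran's Q measures between-study heterogeneity.
    q = sum(w * (b - beta_fe) ** 2 for w, b in zip(weights, betas))
    k = len(betas)
    c = total - sum(w ** 2 for w in weights) / total
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    re_weights = [1.0 / (se ** 2 + tau2) for se in ses]
    re_total = sum(re_weights)
    beta_re = sum(w * b for w, b in zip(re_weights, betas)) / re_total
    return beta_re, math.sqrt(1.0 / re_total), tau2

def two_sided_p(beta, se):
    """Two-sided p-value from a normal (Wald) test statistic."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))

# Hypothetical per-study log odds ratios and standard errors for one variant.
betas = [0.1, 0.5]
ses = [0.05, 0.05]
beta_fe, se_fe = fixed_effects(betas, ses)
beta_re, se_re, tau2 = dersimonian_laird(betas, ses)
print(beta_fe, se_fe, two_sided_p(beta_fe, se_fe))
print(beta_re, se_re, tau2, two_sided_p(beta_re, se_re))
```

When the per-study effects agree, the estimated between-study variance is zero and the two models coincide. When they disagree, the random-effects standard error widens, which is why the choice of model can in principle change which loci reach genome-wide significance.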