UC San Diego
Statistical methods for genetic association analysis involving complex longitudinal data
- Author(s): Salem, Rany Mansour
- et al.
Most, if not all, human phenotypes exhibit a temporal, dosage-dependent, or age effect. In this work, I explore and showcase the use of different analytical methods for assessing the genetic contribution to traits with temporal trends, or what I refer to as 'dynamic complex traits' (DCTs). The study of DCTs could offer insights into disease pathogenesis that are not achievable in other research settings. I describe the development of a method of DCT analysis termed 'Curve-Based Multivariate Distance Matrix Regression' (CMDMR) and demonstrate its application to genetic association analysis using data from a structured longitudinal clinical study (Chapter 2). The method was found to perform as well as or better than traditional statistical methods that might be applied to DCTs. I also applied the CMDMR method in a genome-wide association (GWA) study of height that essentially exploits dissimilarity among the longitudinal height profiles of individuals with different genotypes (Chapter 3). I applied this framework to height growth data from the Bogalusa Heart Study. I identified 7 novel variants in 6 loci (FAM19A1, FGF20, SCD5, MAP3K7, GLCCI1 and TJP2) associated with height profiles using parametric curves (all p-values < 1e-6). I was also able to replicate previously reported associations between genetic variants and adult height. This is the first GWA study to fully utilize longitudinal data. Finally, I considered approaches to the analysis of 'Longitudinal Unstructured Clinical Information' (LUCI) using a variety of mixed model approaches (Chapter 4). These approaches were showcased and contrasted on datasets from two independent clinical studies to assess the influence of genetic variations on longitudinal glomerular filtration rate (GFR) profiles.
The first study is a clinical trial with a pre-specified temporal pattern of measurement collection, whereas the second study involved the analysis of data abstracted from actual clinic-derived longitudinal medical records. With careful attention to data source issues, potential biases, and the planning and conduct of the analysis, the results of the two studies were found to be consistent. In addition, a novel association between GFR profile and a 10-bp deletion-insertion polymorphism in coagulation factor VII at position -323 was identified. I conclude that both novel mixed model approaches and extensions of traditional ones will be useful in the genetic analysis of DCTs despite the LUCI-associated problems.
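To make concrete the idea of exploiting dissimilarity among longitudinal profiles of individuals with different genotypes, the following is a minimal sketch of a generic multivariate distance matrix regression with a permutation test. It is not the CMDMR implementation from the thesis. The function names (`mdmr_pseudo_f`, `mdmr_permutation_test`), the Euclidean distance between curves, the simulated data, and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np


def mdmr_pseudo_f(D, X):
    """Pseudo-F statistic for an n x n distance matrix D regressed on predictors X."""
    n = D.shape[0]
    X1 = np.column_stack([np.ones(n), X])        # add an intercept column
    m = X1.shape[1] - 1                          # number of predictors, excluding intercept
    H = X1 @ np.linalg.pinv(X1.T @ X1) @ X1.T    # projection ("hat") matrix
    C = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    G = C @ (-0.5 * D ** 2) @ C                  # Gower-centered inner-product matrix
    R = np.eye(n) - H                            # residual projection
    explained = np.trace(H @ G @ H) / m
    residual = np.trace(R @ G @ R) / (n - m - 1)
    return explained / residual


def mdmr_permutation_test(D, X, n_perm=199, seed=0):
    """Permutation p-value obtained by shuffling the rows of X relative to D."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float).reshape(D.shape[0], -1)
    f_obs = mdmr_pseudo_f(D, X)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(D.shape[0])
        if mdmr_pseudo_f(D, X[perm]) >= f_obs:
            exceed += 1
    return f_obs, (exceed + 1) / (n_perm + 1)


# Simulated example (hypothetical data): 60 individuals, 10 time points each,
# with the slope of each individual's profile depending on genotype (0, 1, or 2).
rng = np.random.default_rng(1)
genotype = rng.integers(0, 3, size=60)
time = np.linspace(0.0, 1.0, 10)
curves = (1.0 + 0.5 * genotype)[:, None] * time[None, :]
curves = curves + rng.normal(scale=0.1, size=curves.shape)

# Pairwise Euclidean distance between whole profiles (an assumed distance measure).
D = np.linalg.norm(curves[:, None, :] - curves[None, :, :], axis=2)

f_obs, p_value = mdmr_permutation_test(D, genotype)
print(f"pseudo-F = {f_obs:.2f}, permutation p = {p_value:.3f}")
```

In this sketch the whole profile, rather than a single summary value, enters the analysis through the distance matrix, so any measure of dissimilarity between curves could be substituted for the Euclidean distance used here.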