This dissertation covers three distinct topics in survival analysis: 1) current status data in the context of group testing subject to misclassification; 2) marginal structural modeling of a safety outcome from clinical trial data; and 3) the relationship between preterm birth and weight gain in pregnancy. A separate abstract for each chapter is presented below.
Chapter 2. Group testing, introduced by Dorfman (1943), has been used to reduce costs when estimating the prevalence of a binary characteristic based on a screening test of k groups that include n independent individuals in total. If the unknown prevalence is low, and the screening test suffers from misclassification, it is also possible to obtain more precise prevalence estimates than those obtained from testing all n samples separately (Tu et al., 1994). In some applications, the individual binary response corresponds to whether an underlying time-to-event variable T is less than an observed screening time C, a data structure known as current status data. Given sufficient variation in the observed values of C, it is possible to estimate the distribution function, F, of T nonparametrically, at least at some points in its support, using the pool-adjacent-violators algorithm (Ayer et al., 1955). Here, we consider nonparametric estimation of F based on group-tested current status data for groups of size k, where a group tests positive if and only if at least one individual's unobserved T is less than that individual's observed C. We compare the performance of the group-based estimator with that of the individual-test nonparametric maximum likelihood estimator, and show that the former can be more precise in the presence of misclassification for low values of F(t). Potential applications include testing for the presence of various diseases from pooled samples where interest focuses on the age-at-incidence distribution rather than overall prevalence. We use this estimator to estimate the age-at-incidence curve for hepatitis C infection in a sample of U.S. women who gave birth to a child in 2014, where group assignment is done at random and based on maternal age. We discuss the relationship to other work in the literature, and potential extensions.
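As an illustration of the ideas in the Chapter 2 abstract, the following Python sketch computes the individual-test nonparametric estimate of F with a pool-adjacent-violators routine, and shows one simple way a group-level positive probability could be mapped back to the individual scale. It is a minimal sketch under stated assumptions and is not necessarily the estimator developed in the chapter. The function names (`pava`, `current_status_npmle`, `group_to_individual`) are hypothetical, and the group-level mapping assumes that all members of a group share one screening time, that individuals are independent, and that the sensitivity `sens` and specificity `spec` of the group test are known.

```python
def pava(values, weights):
    """Weighted non-decreasing isotonic regression (pool-adjacent-violators)."""
    blocks = []  # each block: [weighted mean, total weight, points pooled]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Pool backwards while adjacent block means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def current_status_npmle(screen_times, positives):
    """Estimate F at each distinct screening time from indicators 1{T < C}."""
    tallies = {}
    for c, d in zip(screen_times, positives):
        hits, n = tallies.get(c, (0, 0))
        tallies[c] = (hits + d, n + 1)
    times = sorted(tallies)
    # Proportion positive at each distinct time, weighted by its sample size.
    means = [tallies[t][0] / tallies[t][1] for t in times]
    counts = [tallies[t][1] for t in times]
    return times, pava(means, counts)


def group_to_individual(p_group, group_size, sens=1.0, spec=1.0):
    """Map a group-level positive probability to an individual-level F value.

    Assumes a common screening time within the group, independent members,
    and known sensitivity and specificity (all illustrative assumptions).
    """
    # Undo misclassification, then clip to the unit interval.
    p_true = (p_group - (1.0 - spec)) / (sens + spec - 1.0)
    p_true = min(max(p_true, 0.0), 1.0)
    # A group is truly positive unless every member is negative.
    return 1.0 - (1.0 - p_true) ** (1.0 / group_size)
```

For example, `current_status_npmle([1, 2, 3, 4, 5], [0, 1, 0, 1, 1])` pools the violating pair at times 2 and 3, so the fitted values are non-decreasing in the screening time.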
Chapter 3. Marginal structural modeling was first developed to address time-dependent confounding in studies where the effect of a time-varying exposure on an outcome is of interest. The chapter begins by introducing the concept of time-dependent confounding and describing inverse probability weighting estimators for the parameters of marginal structural models. The second part of Chapter 3 applies marginal structural modeling to a drug safety study. Studies in pharmacoepidemiology are often conducted in rich data sources, such as clinical trials or administrative databases, where large quantities of information are collected repeatedly over time. These data sources can and should be exploited, but traditional methods often cannot incorporate all available data and fail to take time-dependent confounding into account. Marginal structural modeling and weighted estimators, tools often used in observational studies, can help to alleviate these challenges.
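To make the inverse probability weighting idea in the Chapter 3 abstract concrete, the sketch below computes cumulative stabilized weights for a single subject. It is an illustrative sketch and not the weighting scheme used in the chapter. The function name `stabilized_weights` and its inputs are hypothetical: `num_probs` is assumed to hold, for each visit, the fitted probability (or density) of the exposure actually received given past exposure, and `den_probs` the same quantity additionally conditioned on the time-dependent confounder history. Both would come from exposure models fitted separately.

```python
def stabilized_weights(num_probs, den_probs):
    """Cumulative stabilized inverse probability weights for one subject.

    num_probs[t]: P(observed exposure at visit t | past exposure)
    den_probs[t]: P(observed exposure at visit t | past exposure,
                    time-dependent confounder history)
    Both inputs are assumed to be fitted values from separate models.
    """
    if len(num_probs) != len(den_probs):
        raise ValueError("num_probs and den_probs must have equal length")
    weights = []
    running = 1.0
    for p_num, p_den in zip(num_probs, den_probs):
        # The weight at visit t is the product of ratios up to and including t.
        running *= p_num / p_den
        weights.append(running)
    return weights
```

The resulting weights would then be used in a weighted regression of the outcome on exposure history. In practice the weight distribution is usually inspected first, since extreme weights can make the estimator unstable.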
Our objective in this study was to estimate the relation between rheumatoid arthritis (RA) disease activity, cholesterol levels, and major adverse cardiovascular events (MACE) in patients with moderate to severe RA who are currently prescribed tocilizumab, accounting for time-dependent confounders such as other inflammatory markers, lipid levels, and RA disease measures. We studied 3,986 patients enrolled in one of five clinical trials used to study tocilizumab, who then joined one of three long-term extension studies. We used a weighted logistic regression model to explore associations between pre-treatment levels of RA disease activity and cholesterol and the 5-year risk of MACE. We then used a logistic marginal structural model to explore causal relations between pre- and post-treatment RA disease activity and cholesterol levels and the 5-year risk of MACE, adjusting for time-dependent confounders. We did not find evidence that pre- or post-treatment levels of RA disease activity, HDL cholesterol, or LDL cholesterol were associated with increased risk of MACE in patients with moderate to severe RA taking tocilizumab, once time-dependent confounding from inflammatory markers and other lipid levels was taken into account. We conclude that, after adjustment for time-dependent confounding, traditional markers of disease activity and cholesterol were not associated with an increased risk of cardiac events among RA patients treated with tocilizumab.
Chapter 4. The relationship between weight gain in pregnancy and preterm birth is still contested due to their inherent dependence. In the first part of Chapter 4, we sought to quantify the relationship between pregnancy weight gain and early and late preterm birth, and to evaluate whether associations differed between non-Hispanic (NH) black and NH white women. We analyzed a retrospective cohort of all live births to NH black and NH white women in the U.S. from 2011 to 2015 (n = 10,714,983). We used weight gain z-scores in multiple logistic regression models, stratified by prepregnancy body mass index (BMI) and race, to calculate population attributable risks (PAR) and PAR percentages for early and late preterm birth. We found that both low and high pregnancy weight gain were related to early preterm birth, but these associations varied by BMI and race, and differed from associations with late preterm birth. For high weight gain and early preterm birth, the PAR percentage ranged from 8-10% in NH black women and from 6-8% in NH white women. Racial differences were small or nonexistent for late preterm birth, with PAR percentages ranging from 2-7% in NH black women and from 3-7% in NH white women. We conclude that these findings add to evidence that moderate gestational weight gain could help prevent preterm birth, and suggest that the impact may be greatest for early preterm birth in NH black women.
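For readers unfamiliar with the population attributable risk, the sketch below shows one common definition: the overall risk minus the risk that would be observed if everyone had the reference level of exposure, with the PAR percentage expressing that difference relative to the overall risk. This is an illustration of the general quantity and may differ from the model-based calculation used in the chapter. The function name and the numbers in the usage note are hypothetical and are not results from the dissertation.

```python
def population_attributable_risk(overall_risk, reference_risk):
    """Return (PAR, PAR percentage) from two risks on the probability scale.

    overall_risk:   outcome risk in the whole population
    reference_risk: outcome risk under the reference exposure level
    """
    if not 0.0 < overall_risk <= 1.0:
        raise ValueError("overall_risk must be in (0, 1]")
    par = overall_risk - reference_risk
    par_percent = 100.0 * par / overall_risk
    return par, par_percent
```

With a hypothetical overall risk of 0.10 and a reference-group risk of 0.08, the function returns a PAR of about 0.02 and a PAR percentage of about 20.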
The second part of Chapter 4 is a preliminary analysis assessing various measures of weight gain in pregnancy and their relationship with preterm birth. Serial gestational weight gain (GWG) measurements provide ideal data, but are rarely available in population health datasets. The electronic medical records from 160,635 women in Sweden have been compiled into the largest dataset in the world that contains repeated weight gain measures through pregnancy. Here, we describe the pattern of weight gain in pregnancy in 103,661 Swedish pregnancies, and assess whether the observed pattern before 37 weeks' gestation differs between preterm and term pregnancies.