Multiple Bias Modeling in a Multi-Center Epidemiologic Study of Endometrial Cancer
- Author(s): Thompson, Caroline Avery
- Advisor(s): Arah, Onyebuchi A
- et al.
Quantitative treatment of uncontrolled bias in observational research is a neglected matter. At the dawn of the era of "big data", this is of particular concern because systematic error, as a portion of total error, can grow greatly as sample sizes increase and random error shrinks. Unfortunately, considerable statistical roadblocks stand between a basic multivariable analysis of an exposure-disease relationship and a thorough consideration of the direction and magnitude of uncontrolled bias. Most published literature points to external formula adjustment for a thorough treatment of bias, but the formulas are often too simple (and thus unrealistic) or too complex (and thus unwieldy). A practical solution might be to perform the bias adjustment in the data, before analysis is performed. This solution would be especially useful in pooled data consortium projects, which are becoming increasingly popular as a way to investigate rare exposures and disease subtypes in cancer epidemiology, and which often employ one shared data source used by multiple investigators simultaneously. Record-level data augmentation for bias analysis is central to a pooling project because it allows multiple bias parameters to be placed directly in this data source. In this work we use causal theory, Monte Carlo methods, and the missing data framework to contribute to the literature on quantitative bias modeling, via flexible algorithms that may be used to translate bias adjustment for unmeasured confounding and non-response directly into the data source, before the analysis stage. We provide proof of concept for these methods via a series of simulation studies, and demonstrate their utility in a large multi-center pooled study of the epidemiology of endometrial cancer, employing both fixed hypothetical and probabilistic empirical priors for our bias parameters.
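The point about systematic error and sample size can be made concrete with the standard mean squared error decomposition. This is an illustrative sketch, not a result from the thesis: it assumes an estimator whose bias B does not depend on the sample size n and whose variance shrinks as sigma^2 / n, and the symbols B, sigma, and n are introduced here only for illustration.

```latex
\mathrm{MSE}(\hat{\theta}_n) = B^2 + \frac{\sigma^2}{n},
\qquad
\frac{B^2}{B^2 + \sigma^2 / n} \longrightarrow 1 \quad \text{as } n \to \infty \quad (B \neq 0).
```

Under these assumptions the squared bias is untouched by a larger sample, so its share of total error approaches one as the random component vanishes.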
Moving bias adjustment to the pre-analytic stage opens the door for an augmented data set to be analyzed in any conventional way, with no need for a working knowledge of the complex methodology behind existing external formula adjustments. A thorough, accessible, quantitative bias analysis can then serve as a tool to guide qualitative discussions about the impact of systematic error in multi-study data projects.
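The idea of placing a bias adjustment in the data and then analyzing conventionally can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm developed in the thesis: it assumes a single binary unmeasured confounder U, binary exposure and disease indicators, and three fixed bias parameters (the prevalence of U among exposed and unexposed non-cases, and a U-disease odds ratio that is constant across exposure levels). All function and parameter names are hypothetical.

```python
import random


def impute_confounder(x, d, p_u_exposed, p_u_unexposed, or_ud, rng):
    """Draw a binary U for every record from P(U=1 | exposure, disease).

    p_u_exposed / p_u_unexposed: assumed prevalence of U among non-cases.
    or_ud: assumed U-disease odds ratio (taken as equal in both exposure groups).
    """
    u = []
    for xi, di in zip(x, d):
        p = p_u_exposed if xi == 1 else p_u_unexposed
        if di == 1:
            # Among cases, the odds of U are the non-case odds times or_ud.
            odds = or_ud * p / (1.0 - p)
            p = odds / (1.0 + odds)
        u.append(1 if rng.random() < p else 0)
    return u


def mantel_haenszel_or(x, d, u):
    """Conventional Mantel-Haenszel odds ratio for x and d, stratified on u."""
    # Per stratum: [exposed cases, exposed non-cases,
    #               unexposed cases, unexposed non-cases]
    cells = {0: [0, 0, 0, 0], 1: [0, 0, 0, 0]}
    for xi, di, ui in zip(x, d, u):
        if xi == 1:
            idx = 0 if di == 1 else 1
        else:
            idx = 2 if di == 1 else 3
        cells[ui][idx] += 1
    num = den = 0.0
    for a, b, c, dd in cells.values():
        n = a + b + c + dd
        if n:  # skip empty strata
            num += a * dd / n
            den += b * c / n
    return num / den


def bias_adjusted_ors(x, d, p_u_exposed, p_u_unexposed, or_ud,
                      n_draws=20, seed=0):
    """Monte Carlo loop: augment the records with U, then analyze as usual."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_draws):
        u = impute_confounder(x, d, p_u_exposed, p_u_unexposed, or_ud, rng)
        results.append(mantel_haenszel_or(x, d, u))
    return results
```

Each draw produces an augmented data set in which U is an ordinary column, so the stratified analysis here could be swapped for any conventional method. Replacing the fixed bias parameters with draws from prior distributions inside the loop would be one way to move from a fixed to a probabilistic analysis.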