Causal inference provides researchers with tools to answer scientific and policy questions. The validity of estimated causal effects, however, depends on many factors, from the research design to the credibility of the underlying assumptions. This dissertation addresses three aspects of causal inference: credibility, generalizability, and utility. Each chapter sits at the intersection of two of these aspects.
The first chapter examines credibility and generalizability, and introduces a sensitivity analysis framework for estimating externally valid causal effects. When estimating externally valid causal effects, researchers must invoke a conditional ignorability assumption to account for confounding that arises from selection into the experimental sample. This assumption allows researchers to theoretically identify generalized (or transported) causal effects; however, like many assumptions in causal inference, it is not testable and, in practice, can be untenable. The proposed framework allows researchers to quantify how much bias generalizing or transporting a causal effect can tolerate before the estimated effect substantively changes. The contributions of this chapter are threefold. First, I show that the sensitivity parameters are scale-invariant and standardized, and introduce an estimation approach that allows researchers to simultaneously account for the bias in their estimates from omitting a moderator, as well as potential changes to their inference. Second, I propose several tools researchers can use to perform sensitivity analysis: (1) graphical and numerical summaries for assessing how robust an estimated effect is to changes in magnitude as well as statistical significance; (2) a formal benchmarking approach for estimating potential sensitivity parameter values using existing data; and (3) an extreme scenario analysis. Finally, I demonstrate that the proposed framework extends naturally to the class of doubly robust, augmented weighted estimators. The sensitivity analysis framework is applied to a set of job training program experiments.
The second chapter focuses on utility and generalizability. While recent papers have developed various weighting estimators for the population average treatment effect (PATE), many of these methods suffer from large variance because the experimental sample often differs substantially from the target population, so the estimated sampling weights are extreme. In this chapter, we propose post-residualized weighting, in which we use the outcome measured in the observational population data to build a flexible predictive model (e.g., with machine learning methods) and residualize the outcome in the experimental data before applying conventional weighting methods. We show that the proposed PATE estimator is consistent under the same assumptions required for existing weighting methods, and, importantly, without assuming correct specification of the predictive model. We examine the efficiency gains in the context of a set of job training program experiments, and find that post-residualized weighting can yield a 5% to 25% reduction in variance over standard approaches.
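The procedure described above can be sketched in a few lines. The following is a minimal illustration, not the dissertation's implementation: all data, weight formulas, and variable names are hypothetical, a simple linear fit stands in for the flexible predictive model, and the sampling weights are taken as known rather than estimated.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Synthetic data (all values hypothetical, for illustration only) ---
# Target population: covariate X and an observed (untreated) outcome.
n_pop = 5000
X_pop = rng.normal(0.0, 1.0, n_pop)
Y_pop = 2.0 * X_pop + rng.normal(0.0, 1.0, n_pop)

# Experimental sample: selection depends on X, so weights are needed.
n_exp = 500
X_exp = rng.normal(0.5, 1.0, n_exp)    # sample differs from population
T = rng.integers(0, 2, n_exp)          # randomized treatment assignment
tau = 1.0                              # true treatment effect
Y_exp = 2.0 * X_exp + tau * T + rng.normal(0.0, 1.0, n_exp)

# Sampling weights: density ratio of population to sample covariate
# distributions (assumed known here; in practice they are estimated).
w = np.exp((0.25 - X_exp) / 2.0)

def weighted_dim(y, t, w):
    """Hajek-style weighted difference in means between arms."""
    return (np.sum(w[t == 1] * y[t == 1]) / np.sum(w[t == 1])
            - np.sum(w[t == 0] * y[t == 0]) / np.sum(w[t == 0]))

# Step 1: fit a predictive model g(X) on the population data
# (linear here; any flexible learner could be substituted).
A = np.column_stack([np.ones(n_pop), X_pop])
beta, *_ = np.linalg.lstsq(A, Y_pop, rcond=None)

# Step 2: residualize the experimental outcomes using g(X).
Y_resid = Y_exp - (beta[0] + beta[1] * X_exp)

# Step 3: apply the conventional weighting estimator to the residuals.
pate_standard = weighted_dim(Y_exp, T, w)
pate_residualized = weighted_dim(Y_resid, T, w)
```

Because the same prediction g(X) is subtracted in both arms, its contribution cancels in the difference, so both estimators target the PATE even when g is misspecified; the residualized outcomes simply have less variance to propagate through the extreme weights.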
The final chapter addresses credibility and utility. We introduce a new set of sensitivity models called the "variance-based sensitivity model." The variance-based sensitivity model characterizes the bias from omitting a confounder by bounding the distributional differences that the omission induces in the weights, and offers several notable innovations over existing approaches. First, the variance-based sensitivity model can be parameterized by an R^2 parameter that is both standardized and bounded. We introduce a formal benchmarking procedure that allows researchers to use observed covariates to reason about plausible parameter values in an interpretable and transparent way. Second, we show that researchers can estimate valid confidence intervals under the variance-based sensitivity model, and provide extensions for incorporating substantive knowledge about the confounder to help tighten the intervals. Last, we demonstrate, both empirically and theoretically, that the variance-based sensitivity model can improve both the stability and the tightness of the estimated confidence intervals over existing methods. We illustrate the proposed approach on a study examining blood mercury levels using the National Health and Nutrition Examination Survey (NHANES).
Collectively, the results of this dissertation provide researchers with a broad range of methods for estimating causal effects more transparently and robustly.