Semiparametric Statistical Methods for Causal Inference with Stochastic Treatment Regimes
eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

Abstract

Nearly a century ago, the foundations of modern statistics laid the groundwork for a science of causality. Today, causal inference is central to the study of the most impactful questions at the intersection of science and policy: By what mechanisms do novel therapeutics mitigate relapse in addiction disorders? How do immunobiological markers mediate action mechanisms of vaccines? While randomization provides "gold standard" tools for quantifying causal effects, such trials are costly and limit the scope of scientific inquiry. Thus, techniques for statistical causal inference with complex, observational data are critical to today's, and tomorrow's, scientific endeavors.

Observational studies obviate many of the shortcomings of randomized trials but bring their own challenges and promises. Without randomization, causal inference is plagued by confounding: vaccinees may be more likely to engage in risky behaviors and patients assigned a candidate therapeutic are not uniformly "treated" due to clinician heterogeneity. Adjusting for potential confounders is a daunting challenge in an era where studies routinely measure numerous high-dimensional, longitudinal characteristics. Further, observational studies empower scientists to assess mechanistic, path-specific causal effects that cannot be learned with randomized data. Tools from non/semi-parametric statistical theory and machine learning are needed to avoid imposing unrealistic statistical assumptions, and novel causal effect estimands are required to better address mechanistic questions.

Causal inference methodology is critical to answering real-world scientific questions, but traditional approaches make too many simplifying assumptions. By ignoring biased sampling designs, continuous-valued treatments, and confounding of path-specific effects, standard statistical methods fall far short of empowering mechanistic discovery. Such techniques often require a priori modeling assumptions unsupported by domain knowledge, limiting their utility for real-world data analysis efforts.

This dissertation extends theory and methods for non/semi-parametric causal inference in settings with continuous treatments, with particular attention paid to issues emerging from biased sampling designs and path-specific causal effects. Stochastic treatment regimes provide a unifying framework for formalizing such causal inference problems. Chapter one considers estimation of the generalized propensity score, a quantity critical to estimating the causal effects of stochastic interventions. To tractably estimate this challenging quantity, we formulate algorithms for its flexible estimation using the highly adaptive lasso, a nonparametric regression estimator. We then develop a novel inverse probability weighted estimator of the causal effect of a stochastic intervention, and show it capable of achieving the non/semi-parametric efficiency bound. Chapter two focuses on the application of the causal effects of stochastic interventions in real-world studies that rely upon outcome-dependent two-phase sampling (e.g., case-cohort designs). The work includes a methodological advance that unites techniques for estimating the causal effects of stochastic interventions with corrections for biased sampling, allowing for these complex causal parameters to be efficiently estimated under such designs. Motivated by the aims of an HIV vaccine efficacy trial, this contribution allows researchers to probe how the vaccination-induced immunogenicity of candidate immune correlates of protection may best be modulated by future vaccines, and the proposed methodology is demonstrated through a re-analysis of the data from this trial. The COVID-19 pandemic took the world by storm during the final year of this work, and Chapter three generalizes the methodology proposed in Chapter two to help maximize what can be learned from the critical and timely scientific inquiries posed by COVID-19 vaccine trials. Chapter four examines path-specific causal effects (i.e., causal mediation analysis), formulated based upon stochastic interventions, introducing a new class of direct and indirect effect parameters robust to intermediate confounding. Developing non/semi-parametric efficient techniques for the flexible estimation of these path-specific effects facilitates their use in quantifying mechanistic knowledge and extracting actionable insights from modern, large-scale studies. Chapter five discusses open source software packages for causal inference, which implement the statistical methodology discussed in the preceding chapters. Chapter six concludes with a discussion of avenues that may motivate future research.
