eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

Targeted Minimum Loss Based Estimation: Applications and Extensions in Causal Inference and Big Data

Abstract

Causal inference generally requires making assumptions about a causal mechanism, followed by statistical estimation. The statistical estimation problem in causal inference is often that of estimating a pathwise differentiable parameter in a semiparametric or nonparametric model. Targeted minimum loss-based estimation (TMLE) is a framework for constructing an asymptotically linear plug-in estimator for such parameters.
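For readers unfamiliar with the terminology, the two properties named above can be written out as follows. The notation is an assumption made here for illustration and is not taken from the dissertation: $O_1, \dots, O_n$ are i.i.d. observations from $P_0$, $\Psi$ is the parameter mapping, $\psi_0 = \Psi(P_0)$, and $D$ is an influence curve.

```latex
% Plug-in estimator: the parameter mapping applied to an estimate of P_0
\hat{\psi}_n = \Psi(\hat{P}_n)

% Asymptotic linearity with influence curve D
\sqrt{n}\,\bigl(\hat{\psi}_n - \psi_0\bigr)
  = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} D(O_i) + o_P(1),
\qquad E_{P_0} D(O) = 0, \quad \operatorname{Var}_{P_0} D(O) < \infty
```

Under these conditions $\sqrt{n}\,(\hat{\psi}_n - \psi_0)$ converges in distribution to a normal with mean zero and variance $\operatorname{Var}_{P_0} D(O)$, which is what makes Wald-type confidence intervals available.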

The natural direct effect (NDE) is a parameter that quantifies how a treatment affects an outcome directly, as opposed to indirectly through a mediator that lies between the treatment and the outcome on the causal pathway. In Chapter 2, we introduce the NDE among the untreated and show that, under some assumptions, it is identifiable and equal to the same statistical parameter as the so-called average treatment effect among the untreated. We then present a locally efficient, doubly robust TMLE for the statistical target parameter and apply it to the estimation of the NDE among the untreated in simulations and of the NDE in a data set from a randomized controlled trial (RCT).

Some estimators that adjust for the propensity score (PS) nonparametrically, such as PS matching or stratification by the PS, are robust to slight misspecification of the PS estimator. In particular, if the PS estimator fails to estimate the true propensity score but still approximates some other balancing score, such methods remain consistent for the average treatment effect (ATE). In Chapter 3, we extend a traditional TMLE for the ATE to have this property while remaining locally efficient and doubly robust, and we investigate the performance of the proposed estimator in a simulation study.
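The reason a balancing score suffices can be sketched in a few lines. The notation is again an assumption for illustration, not the dissertation's own: $W$ denotes covariates, $A$ a binary treatment, $Y$ the outcome, $\bar{Q}_0(a, W) = E_0[Y \mid A = a, W]$, and $b(W)$ is any balancing score, meaning $A \perp W \mid b(W)$.

```latex
% Statistical parameter identifying the ATE
\psi_0 = E_0\bigl[\bar{Q}_0(1, W) - \bar{Q}_0(0, W)\bigr]

% Conditioning on a balancing score instead of W
E_0[Y \mid A = a, b(W)]
  = E_0\bigl[\bar{Q}_0(a, W) \mid A = a, b(W)\bigr]
  = E_0\bigl[\bar{Q}_0(a, W) \mid b(W)\bigr]

% Averaging over b(W) therefore recovers the same parameter
E_0\bigl[E_0[Y \mid A = 1, b(W)] - E_0[Y \mid A = 0, b(W)]\bigr] = \psi_0
```

The second equality in the middle line uses $A \perp W \mid b(W)$. The true propensity score $g_0(W) = P_0(A = 1 \mid W)$ is one balancing score, but any $b(W)$ with this property gives the same result, which is why nonparametric adjustment for an estimate that approximates some other balancing score can still be consistent.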

Online estimators process a relatively small piece of a data set at a time and can be updated as more data become available. Online estimators are typically used in the large-scale machine learning literature, but to our knowledge they have not been used to estimate statistical parameters associated with causal parameters. In Chapter 4, we propose two online estimators for the ATE that are asymptotically efficient and doubly robust in a single pass through a data set. The first is similar to the augmented inverse probability of treatment weighting estimator in the batch setting, and the second involves an additional targeting step inspired by TMLE, which improves performance in some cases. We investigate the performance of both in a simulation study.
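To make the single-pass idea concrete, here is a minimal sketch of an AIPW-style online estimator. It is not the dissertation's estimator: the data-generating model, the linear and logistic working models, the plain SGD updates, and the names `simulate` and `online_aipw` are all assumptions chosen for illustration. Each mini-batch is first scored with nuisance fits learned from earlier data only, and is then used to update those fits.

```python
import math
import random


def expit(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate(n, rng):
    """Yield (w, a, y) one at a time from a toy model whose true ATE is 1.0."""
    for _ in range(n):
        w = rng.gauss(0.0, 1.0)                         # confounder
        a = 1 if rng.random() < expit(0.5 * w) else 0   # treatment
        y = 1.0 * a + w + rng.gauss(0.0, 1.0)           # outcome
        yield w, a, y


def online_aipw(stream, batch_size=100, lr=0.5, clip=0.01):
    """Single-pass AIPW-style ATE estimate with SGD-updated nuisance models."""
    beta = [0.0, 0.0, 0.0]  # outcome model: E[Y|A,W] ~ b0 + b1*A + b2*W
    gamma = [0.0, 0.0]      # propensity model: P(A=1|W) ~ expit(c0 + c1*W)
    total, count = 0.0, 0

    def q(a, w):
        return beta[0] + beta[1] * a + beta[2] * w

    def g(w):
        # Bound the propensity score away from 0 and 1 before inverting it.
        return min(max(expit(gamma[0] + gamma[1] * w), clip), 1.0 - clip)

    def process(batch):
        nonlocal total, count
        # 1) Score the batch using nuisance fits from earlier data only.
        for w, a, y in batch:
            gw = g(w)
            weight = a / gw - (1 - a) / (1.0 - gw)
            total += q(1, w) - q(0, w) + weight * (y - q(a, w))
            count += 1
        # 2) Accumulate gradients at the current parameters, then take one
        #    SGD step per model (squared-error loss and logistic loss).
        grad_b = [0.0, 0.0, 0.0]
        grad_g = [0.0, 0.0]
        for w, a, y in batch:
            r = q(a, w) - y
            e = expit(gamma[0] + gamma[1] * w) - a
            for j, x in enumerate((1.0, a, w)):
                grad_b[j] += r * x
            for j, x in enumerate((1.0, w)):
                grad_g[j] += e * x
        m = len(batch)
        for j in range(3):
            beta[j] -= lr * grad_b[j] / m
        for j in range(2):
            gamma[j] -= lr * grad_g[j] / m

    batch = []
    for obs in stream:
        batch.append(obs)
        if len(batch) == batch_size:
            process(batch)
            batch = []
    if batch:  # final partial batch
        process(batch)
    return total / count


estimate = online_aipw(simulate(20_000, random.Random(0)))
print(f"online AIPW-style estimate of the ATE: {estimate:.3f}")
```

Because every observation is scored before it is used for fitting, the data never need to be held in memory or revisited. The earliest batches are scored with poor nuisance fits, so this sketch carries a small start-up bias that shrinks as the stream grows. A targeting step of the kind described above would be an additional update and is not shown here.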
