Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

Targeted maximum likelihood estimation of treatment effects in randomized controlled trials and drug safety analysis


In most randomized controlled trials (RCTs), investigators rely on estimators of causal effects that do not exploit the information in the many baseline covariates that are routinely collected in addition to treatment and the outcome. Ignoring these covariates can lead to a significant loss in estimation efficiency and thus power. Statisticians have underscored the gain in efficiency that can be achieved from covariate adjustment in RCTs, with a focus on problems involving linear models. Despite recent theoretical advances, there has been a reluctance to adjust for covariates for two primary reasons: 1) covariate-adjusted estimates based on non-linear regression models have been shown to be less precise than unadjusted estimates, and 2) there is concern over the opportunity to manipulate the model selection process for covariate adjustment in order to obtain favorable results. This dissertation describes statistical approaches for covariate adjustment in RCTs using targeted maximum likelihood methodology for estimation of causal effects with binary and right-censored survival outcomes.

Chapter 2 provides the targeted maximum likelihood approach to covariate adjustment in RCTs with binary outcomes, focusing on the estimation of the risk difference, relative risk and odds ratio. In such trials, investigators generally rely on the unadjusted estimate as the literature indicates that covariate-adjusted estimates based on logistic regression models are less efficient. The crucial step that has been missing when adjusting for covariates is that one must integrate/average the adjusted estimate over those covariates in order to obtain the population-level effect. Chapter 2 shows that covariate adjustment in RCTs using logistic regression models can be mapped, by averaging over the covariate(s), to obtain a fully robust and efficient estimator of the marginal effect, which equals a targeted maximum likelihood estimator. Simulation studies are provided that demonstrate that this targeted maximum likelihood method increases efficiency and power over the unadjusted method, particularly for smaller sample sizes, even when the regression model is misspecified.
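The averaging step described above can be illustrated with a small sketch. It is not the dissertation's code: the simulated trial, the single covariate `W`, the coefficient values, and all function names are assumptions made for illustration. The sketch fits a logistic regression of the outcome on treatment and the covariate, predicts each subject's risk under treatment and under control, and averages those predictions over the observed covariates to get marginal risks, from which the risk difference, relative risk and odds ratio follow.

```python
import math
import random


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def solve(M, v):
    """Solve M x = v by Gauss-Jordan elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_logistic(X, y, iters=25):
    """Maximum likelihood logistic regression via Newton-Raphson."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            mu = expit(sum(b * x for b, x in zip(beta, xi)))
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += mu * (1.0 - mu) * xi[j] * xi[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


def marginal_effects(A, W, Y):
    """Average covariate-adjusted predictions over W to get marginal effects."""
    beta = fit_logistic([[1.0, a, w] for a, w in zip(A, W)], Y)
    n = len(Y)
    # Predicted risk for every subject with treatment set to 1, then to 0.
    p1 = sum(expit(beta[0] + beta[1] + beta[2] * w) for w in W) / n
    p0 = sum(expit(beta[0] + beta[2] * w) for w in W) / n
    return {
        "p1": p1,
        "p0": p0,
        "risk_difference": p1 - p0,
        "relative_risk": p1 / p0,
        "odds_ratio": (p1 / (1.0 - p1)) / (p0 / (1.0 - p0)),
    }


def simulate_trial(n, seed=0):
    """Hypothetical RCT: 1:1 randomization, one prognostic covariate."""
    rng = random.Random(seed)
    A, W, Y = [], [], []
    for _ in range(n):
        w = rng.gauss(0.0, 1.0)
        a = 1.0 if rng.random() < 0.5 else 0.0
        y = 1.0 if rng.random() < expit(-0.5 + 1.0 * a + 1.5 * w) else 0.0
        A.append(a)
        W.append(w)
        Y.append(y)
    return A, W, Y


A, W, Y = simulate_trial(500)
result = marginal_effects(A, W, Y)
print(result)
```

Note that the treatment coefficient from the fitted model is a conditional log odds ratio; the marginal odds ratio is computed from the averaged risks `p1` and `p0` instead, which is the distinction the paragraph above draws.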

Chapter 3 applies the methodology presented in Chapter 2 to a sampled RCT dataset with a binary outcome to further explore the origin of the gains in efficiency and provide a criterion for determining whether a gain in efficiency can be achieved with covariate adjustment over the unadjusted method. This chapter demonstrates through simulation studies and the data analysis that not only is the relation between $R^2$ and efficiency gain important, but also the presence of empirical confounding. Based on the results of these studies, a complete strategy for analyzing this type of data is formalized that provides a robust method for covariate adjustment while protecting investigators from misuse of these methods for obtaining favorable inference.

Chapters 4 and 5 focus on estimation of causal effects with right-censored survival outcomes. Time-to-event outcomes are naturally subject to right-censoring due to early patient withdrawals. In Chapter 4, the targeted maximum likelihood methodology is applied to the estimation of treatment-specific survival at a fixed end-point in time. In Chapter 5, the same methodology is applied to provide a competitor to the logrank test. The proposed covariate-adjusted estimators, under no or uninformative censoring, do not require any additional parametric modeling assumptions, and under informative censoring, are consistent if either the censoring mechanism or the conditional hazard for survival is estimated consistently. These targeted maximum likelihood estimators have two important advantages over the Kaplan-Meier and logrank approaches: 1) they exploit covariates to improve efficiency, and 2) they are consistent in the presence of informative censoring. These properties are demonstrated through simulation studies.
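For orientation, the unadjusted comparator named above, the Kaplan-Meier estimator of survival at a fixed time point, can be sketched as follows. This is the standard product-limit formula, not the targeted maximum likelihood estimator the dissertation proposes. The function name and the toy data are assumptions made for illustration. Applying it separately within each treatment arm gives the unadjusted treatment-specific survival estimates.

```python
def kaplan_meier(times, events, t0):
    """Kaplan-Meier estimate of S(t0) = P(T > t0).

    times  : observed follow-up times (event or censoring, whichever came first)
    events : 1 if the event was observed at that time, 0 if right-censored
    """
    surv = 1.0
    # Distinct observed event times up to and including t0, in increasing order.
    event_times = sorted({t for t, d in zip(times, events) if d == 1 and t <= t0})
    for t in event_times:
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, d in zip(times, events) if ti == t and d == 1)
        surv *= 1.0 - deaths / at_risk
    return surv


# Toy data for one hypothetical treatment arm: two subjects are censored.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
print(kaplan_meier(times, events, 3))  # (5/6) * (4/5) * (2/3) = 4/9
```

The estimator uses only follow-up times and event indicators, so it ignores baseline covariates entirely and relies on censoring being uninformative, which are the two limitations the covariate-adjusted estimators are meant to address.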

Chapter 6 concludes with a summary of the preceding chapters and a discussion of future research directions.
