Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

Applications of Semi-parametric Estimation Methods in Causal Inference and Prediction


In this thesis, we argue for the use of loss-based semi-parametric estimation methods as an alternative to traditional parametric models in causal inference and prediction. In the first chapter, we briefly discuss "black box" epidemiology and argue that risk factor epidemiology can be improved by using semi-parametric estimation methods. We demonstrate these methods by applying them to two problems: one in causal inference and one in prediction. In each case, we walk through the process one would follow to define the question of interest, express it as a parameter, and estimate that parameter using semi-parametric methods. In the second chapter, we introduce a formal concept of a perception effect and define unmasking and placebo effects in the context of randomized trials. We employ modern tools from causal inference to derive semi-parametric estimators of these effects. The methods are illustrated on a motivating example from a recent pain trial in which the occurrence of treatment-related side effects acts as a proxy for unmasking. In the third chapter, we redefine perception and unmasking effects for a longitudinal setting and explore various causal graphs for the gabapentin trial. We demonstrate the application of the semi-parametric methods in this more general setting by assuming a more complicated causal graph. To estimate the parameters, we use Maximum Likelihood Estimation and two different versions of Targeted Maximum Likelihood Estimation. Finally, in the fourth chapter, we approach coronary heart disease risk prediction from a semi-parametric perspective using data from the Framingham study. The "super learner" is used with a library of machine learning algorithms to create an ensemble risk prediction model for coronary heart disease. We define relative risk importance parameters for various risk factors and estimate them with the semi-parametric methods used in the earlier chapters. The results are compared with those of the Framingham study and with those obtained by fitting a parametric model to the Framingham dataset.
