This dissertation consists of three chapters that study causal inference when applying machine learning methods. In Chapter 1, I propose an orthogonal extension of the semiparametric
difference-in-differences estimator proposed in Abadie (2005). The proposed estimator
satisfies Neyman orthogonality (Chernozhukov et al. 2018) and thus allows
researchers to flexibly use a rich set of machine learning (ML) methods in the first-step estimation.
It is particularly useful when researchers confront a high-dimensional data set in which
the number of potential control variables exceeds the sample size and conventional
nonparametric estimation methods, such as kernel and sieve estimators, do not apply. I apply
this orthogonal difference-in-differences estimator to evaluate the effect of tariff reduction
on corruption. The empirical results show that tariff reduction substantially reduces
corruption.
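To make the starting point of Chapter 1 concrete, the following is a minimal sketch of the weighting form of the semiparametric difference-in-differences estimator in Abadie (2005), which averages the outcome change times (D - p(X)) / (P(D = 1) (1 - p(X))). It is not the orthogonal estimator proposed in the chapter. The function name `abadie_did_att`, the simulated data, and the use of the true propensity score in place of a first-step estimate are all illustrative assumptions.

```python
import numpy as np


def abadie_did_att(delta_y, d, pscore):
    """Weighting estimate of the ATT in the form of Abadie (2005).

    delta_y : outcome change between the post- and pre-treatment periods
    d       : treatment indicator (0 or 1)
    pscore  : propensity score P(D = 1 | X); in practice a first-step estimate
    """
    delta_y = np.asarray(delta_y, dtype=float)
    d = np.asarray(d, dtype=float)
    pscore = np.asarray(pscore, dtype=float)
    p_treated = d.mean()  # sample analogue of P(D = 1)
    weights = (d - pscore) / (p_treated * (1.0 - pscore))
    return float(np.mean(delta_y * weights))


# Hypothetical simulated panel with a constant treatment effect of 2.
rng = np.random.default_rng(0)
n = 200_000
x = rng.uniform(-1.0, 1.0, size=n)           # a single control variable
pscore = 1.0 / (1.0 + np.exp(-x))            # true propensity score (assumed known)
d = rng.binomial(1, pscore)                  # treatment depends on x
delta_y = x + 2.0 * d + rng.normal(size=n)   # the trend in the outcome depends on x

att_hat = abadie_did_att(delta_y, d, pscore)
print(att_hat)
```

In this sketch the propensity score is known. When it is instead estimated by an ML method, the first-step error can affect the weighting estimate directly, which is the situation the orthogonal extension is meant to address.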
In Chapter 2, I study the estimation and inference of the mode treatment effect. Mean, median, and mode are three essential measures of the centrality of probability distributions.
In program evaluation, the average treatment effect (mean) and the quantile treatment
effect (median) have been intensively studied in the past decades. The mode treatment
effect, however, has long been neglected in program evaluation. This paper fills the gap by
discussing both the estimation and inference of the mode treatment effect. I propose both
traditional kernel and machine learning methods to estimate the mode treatment effect. I also
derive the asymptotic properties of the proposed estimators and find that both estimators
are asymptotically normal but converge at a rate slower than the regular
rate N^(1/2), which differs from the rates of the classical average and quantile treatment
effect estimators.
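As a rough illustration of a kernel-based approach to the mode treatment effect, the sketch below takes the difference between the maximizers of two inverse-probability-weighted Gaussian kernel density estimates, one for each potential outcome. It is a simplified stand-in under stated assumptions, not the estimator derived in the chapter. The function names, the bandwidth of 0.5, the evaluation grid, the simulated data, and the use of the true propensity score are all hypothetical choices.

```python
import numpy as np


def weighted_kernel_mode(y, weights, bandwidth, grid):
    """Grid point that maximizes a weighted Gaussian kernel density estimate."""
    y = np.asarray(y, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Unnormalized density at each grid point; scaling does not change the argmax.
    density = np.array(
        [np.sum(weights * np.exp(-0.5 * ((g - y) / bandwidth) ** 2)) for g in grid]
    )
    return float(grid[np.argmax(density)])


def ipw_mode_effect(y, d, pscore, bandwidth, grid):
    """Mode of Y(1) minus mode of Y(0), using inverse-probability weights."""
    y = np.asarray(y, dtype=float)
    pscore = np.asarray(pscore, dtype=float)
    treated = np.asarray(d) == 1
    mode1 = weighted_kernel_mode(y[treated], 1.0 / pscore[treated], bandwidth, grid)
    mode0 = weighted_kernel_mode(
        y[~treated], 1.0 / (1.0 - pscore[~treated]), bandwidth, grid
    )
    return mode1 - mode0


# Hypothetical simulated data: Y(0) is symmetric around 0 and Y(1) = Y(0) + 1,
# so the true mode treatment effect is 1.
rng = np.random.default_rng(0)
n = 20_000
x = rng.uniform(-1.0, 1.0, size=n)
pscore = 1.0 / (1.0 + np.exp(-x))        # true propensity score (assumed known)
d = rng.binomial(1, pscore)
y = x + 1.0 * d + rng.normal(size=n)

grid = np.arange(-4.0, 5.0, 0.02)        # candidate mode locations
mte_hat = ipw_mode_effect(y, d, pscore, bandwidth=0.5, grid=grid)
print(mte_hat)
```

The estimate depends on the bandwidth, which is one intuition for why kernel-based mode estimators converge more slowly than the regular rate.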
In Chapter 3 (joint with Liqiang Shi), we study the estimation and inference of the doubly robust extension of the semiparametric quantile treatment effect estimator discussed in
Firpo (2007). This proposed estimator allows researchers to use a rich set of machine learning
methods in the first-step estimation, while still obtaining valid inferences. Researchers
can include as many control variables as they consider necessary, without worrying about the
overfitting problem that frequently arises in traditional estimation methods. This
paper complements Belloni et al. (2017), which provided a very general framework to discuss
the estimation and inference of many different treatment effects when researchers apply
machine learning methods.