# Your search: "author:Sekhon, Jasjeet S."


## Scholarly Works (9 results)

High-dimensional and causal inference are topics at the forefront of statistical research. This thesis is a unified treatment of three contributions to these literatures. The first two contributions are to the theoretical statistical literature; the third puts the techniques of causal inference into practice in policy evaluation.

In Chapter 2, we suggest that a broadly applicable remedy for the failure of Efron’s bootstrap in high dimensions is to modify the bootstrap so that data vectors are broken into blocks and the blocks are resampled independently of one another. Cross-validation can be used effectively to choose the optimal block length. We show both theoretically and in numerical studies that this method restores consistency and has superior predictive performance when used in combination with Breiman’s bagging procedure. This chapter is joint work with Peter Hall and Hugh Miller.
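The coordinate-block resampling idea can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `block_bootstrap_sample` is a hypothetical name, and the cross-validated choice of block length is omitted.

```python
import random

def block_bootstrap_sample(X, block_len, rng=random):
    """Draw one bootstrap sample by splitting each data vector into
    blocks of coordinates and resampling rows independently per block,
    so dependence between distant coordinates is broken."""
    n, p = len(X), len(X[0])
    # start indices of the coordinate blocks
    blocks = [range(s, min(s + block_len, p)) for s in range(0, p, block_len)]
    sample = [[None] * p for _ in range(n)]
    for cols in blocks:
        rows = [rng.randrange(n) for _ in range(n)]  # independent resample per block
        for i, r in enumerate(rows):
            for j in cols:
                sample[i][j] = X[r][j]
    return sample
```

Each block of a resampled vector is copied intact from some original observation, but different blocks may come from different observations.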

In Chapter 3, we investigate regression adjustment for the modified outcome (RAMO). An equivalent procedure is given in Rubin and van der Laan [2007] and then in Luedtke and van der Laan [2016]; philosophically similar ideas appear to originate in Miller [1976]. We establish new guarantees when the procedure is applied in designed experiments (where the propensity score is known a priori) and confirm that the procedure is doubly robust. RAMO can be implemented in only a few lines of code and it can be immediately combined with existing regression models, including random forests and deep neural networks, used in classical prediction problems. This chapter is joint work with Bin Yu and Jasjeet Sekhon.
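A toy sketch of the "few lines of code" claim, for a designed experiment with known propensity score: the modified outcome has the CATE as its conditional mean, so any regression of it on covariates estimates treatment effect heterogeneity. The stratum-mean "learner" below is a stand-in for the random forests or deep networks mentioned in the text, and `ramo_fit` is an illustrative name.

```python
def modified_outcome(y, t, e):
    """Horvitz-Thompson modified outcome: its conditional mean given
    the covariates equals the conditional average treatment effect."""
    return t * y / e - (1 - t) * y / (1 - e)

def ramo_fit(xs, ys, ts, e):
    """RAMO sketch: regress the modified outcome on covariates.
    The 'regression' here is a stratum mean over a discrete covariate,
    standing in for an arbitrary supervised learner."""
    ystar = [modified_outcome(y, t, e) for y, t in zip(ys, ts)]
    groups = {}
    for x, v in zip(xs, ystar):
        groups.setdefault(x, []).append(v)
    return {x: sum(v) / len(v) for x, v in groups.items()}
```

Swapping the stratum mean for any fitted regression model gives the general procedure.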

In Chapter 4, we investigate the specific deterrent effect of traffic citations. In Queensland, Australia, many speeding and red-light running offenses are detected by traffic cameras, and drivers are notified of the citation not at the time they commit the offense but when the citation notice is delivered by mail about two weeks later. We use a regression discontinuity design to assess whether the risk of crashing or reoffending changes at the moment of notification. We analyze a population of nearly 3 million drivers who committed camera-detected offenses. We conclude that there is no significant change in the incidence of crashes, but there is a marked decrease in recidivism of about 25%. This chapter is joint work with David Studdert and Jeremy Goldhaber-Fiebert.
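The regression discontinuity contrast can be caricatured as a local difference in means around the notification date. This is a minimal sketch under assumed names (`rd_jump`, a fixed bandwidth); the actual analysis involves local fitting and formal inference not shown here.

```python
def rd_jump(days, outcomes, cutoff=0, bandwidth=7):
    """Mean outcome just after minus just before the cutoff (here, the
    day the citation notice arrives), within a narrow bandwidth."""
    left = [y for d, y in zip(days, outcomes) if cutoff - bandwidth <= d < cutoff]
    right = [y for d, y in zip(days, outcomes) if cutoff <= d < cutoff + bandwidth]
    return sum(right) / len(right) - sum(left) / len(left)
```

A negative jump in the reoffense indicator at the cutoff corresponds to the deterrent effect described above.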

This dissertation explores methodological topics in the analysis of randomized experiments, with a focus on weakening the assumptions of conventional models.

Chapter 1 gives an overview of the dissertation, emphasizing connections with other areas of statistics (such as survey sampling) and other fields (such as econometrics and psychometrics).

Chapter 2 reexamines Freedman's critique of ordinary least squares regression adjustment in randomized experiments. Using Neyman's model for randomization inference, Freedman argued that adjustment can lead to worsened asymptotic precision, invalid measures of precision, and small-sample bias. This chapter shows that in sufficiently large samples, those problems are minor or easily fixed. OLS adjustment cannot hurt asymptotic precision when a full set of treatment-covariate interactions is included. Asymptotically valid confidence intervals can be constructed with the Huber-White sandwich standard error estimator. Checks on the asymptotic approximations are illustrated with data from a randomized evaluation of strategies to improve college students' achievement. The strongest reasons to support Freedman's preference for unadjusted estimates are transparency and the dangers of specification search.
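The interacted-adjustment result can be sketched for a single covariate: with a full treatment-covariate interaction, the OLS treatment-effect estimate equals the difference between arm-specific regression predictions at the overall covariate mean. The function names are illustrative, and the sandwich standard errors discussed above are not shown.

```python
from statistics import mean

def ols_1d(x, y):
    """Intercept and slope of simple least squares."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

def interacted_ate(x, y, t):
    """OLS adjustment with a full treatment-covariate interaction:
    fit a separate regression in each arm and contrast predictions at
    the overall covariate mean (equivalent to the treatment coefficient
    in the interacted OLS fit with centered covariates)."""
    x1 = [xi for xi, ti in zip(x, t) if ti]
    y1 = [yi for yi, ti in zip(y, t) if ti]
    x0 = [xi for xi, ti in zip(x, t) if not ti]
    y0 = [yi for yi, ti in zip(y, t) if not ti]
    a1, b1 = ols_1d(x1, y1)
    a0, b0 = ols_1d(x0, y0)
    mx = mean(x)
    return (a1 + b1 * mx) - (a0 + b0 * mx)
```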

Chapter 3 extends the discussion and analysis of the small-sample bias of OLS adjustment. The leading term in the bias of adjustment for multiple covariates is derived and can be estimated empirically, as was done in Chapter 2 for the single-covariate case. Possible implications for choosing a regression specification are discussed.

Chapter 4 explores and modifies an approach suggested by Rosenbaum for analysis of treatment effects when the outcome is censored by death. The chapter is motivated by a randomized trial that studied the effects of an intensive care unit staffing intervention on length of stay in the ICU. The proposed approach estimates effects on the distribution of a composite outcome measure based on ICU mortality and survivors' length of stay, addressing concerns about selection bias by comparing the entire treatment group with the entire control group. Strengths and weaknesses of possible primary significance tests (including the Wilcoxon-Mann-Whitney rank sum test and a heteroskedasticity-robust variant due to Brunner and Munzel) are discussed and illustrated.

- 1 supplemental PDF

With the rise of large and fine-grained data sets, researchers, physicians, businesses, and policymakers increasingly wish to estimate treatment effect heterogeneity across individuals and contexts with ever-greater precision, in order to allocate resources effectively, assign treatments appropriately, and understand the underlying causal mechanisms. In this thesis, we provide tools for estimating and understanding treatment effect heterogeneity.

Chapter 1 introduces a unifying framework for many estimators of the Conditional Average Treatment Effect (CATE), a function that describes the treatment heterogeneity. We introduce meta-learners as algorithms that can be combined with any machine learning/regression method to estimate the CATE. We also propose a new meta-learner, the X-learner, that can adapt to structural properties such as the smoothness and sparsity of the underlying treatment effect. We then present its desirable properties through simulations and theory and apply it to two field experiments.
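The X-learner's stages can be sketched with a toy base learner. This is a schematic under stated assumptions, not the package's implementation: the stratum-mean learner stands in for the arbitrary ML regression method a meta-learner wraps, and a constant known propensity score is assumed.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Toy base learner: stratum means over a discrete covariate,
    standing in for any ML regression method."""
    g = {}
    for x, y in zip(xs, ys):
        g.setdefault(x, []).append(y)
    m = {x: mean(v) for x, v in g.items()}
    return lambda x: m[x]

def x_learner(xs, ys, ts, e=0.5):
    """X-learner sketch: (1) fit outcome models per arm, (2) impute
    individual treatment effects, (3) regress the imputed effects,
    (4) blend the two CATE estimates with the propensity score."""
    x1 = [x for x, t in zip(xs, ts) if t]
    y1 = [y for y, t in zip(ys, ts) if t]
    x0 = [x for x, t in zip(xs, ts) if not t]
    y0 = [y for y, t in zip(ys, ts) if not t]
    mu1, mu0 = fit_mean(x1, y1), fit_mean(x0, y0)
    d1 = [y - mu0(x) for x, y in zip(x1, y1)]   # imputed effects, treated arm
    d0 = [mu1(x) - y for x, y in zip(x0, y0)]   # imputed effects, control arm
    tau1, tau0 = fit_mean(x1, d1), fit_mean(x0, d0)
    return lambda x: e * tau0(x) + (1 - e) * tau1(x)
```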

As part of this thesis, we created an R package, causalToolbox, that implements eight CATE estimators and several tools that are useful for estimating the CATE and understanding the underlying causal mechanism. Chapter 2 focuses on the causalToolbox package and explains how the package is structured and implemented. The package uses the same syntax for all implemented CATE estimators, which makes it easy for users to switch between estimators and compare them on a given data set. We give examples of how the package can be used to find a well-performing estimator for a given data set, how confidence intervals for the CATE can be computed, and how estimating the CATE for a unit with many CATE estimators simultaneously can give practitioners a sense of which estimates are unstable and depend heavily on the chosen estimator.

Chapter 3 is an application of the causalToolbox package, demonstrating its usefulness in a simulation study set up for the Empirical Investigation of Methods for Heterogeneity Workshop at the 2018 Atlantic Causal Inference Conference by Carlos Carvalho, Jennifer Hill, Jared Murray, and Avi Feller, based on the National Study of Learning Mindsets.

When implementing the CATE estimators, we noticed that there was a need for a variation of the Random Forests (RF) algorithm that works particularly well for statistical inference. We designed an R package, forestry, that implements a new version of the RF algorithm and several tools for statistical inference with it. In Chapter 4, we describe the problem that confidence interval estimation with RF can perform poorly in regions where the RF is biased or that lie outside the support of the training data. We then introduce a new method that screens for points at which our confidence interval methods should not be used.
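One crude proxy for such screening is to flag test points outside the coordinate-wise range of the training data, where a forest must extrapolate. This is only an illustration of the idea; the method described in the chapter is more refined, and `support_screen` is a hypothetical name.

```python
def support_screen(train, test):
    """Flag test points falling outside the coordinate-wise range of
    the training data, a rough indicator of extrapolation where RF
    confidence intervals are unreliable."""
    lo = [min(col) for col in zip(*train)]
    hi = [max(col) for col in zip(*train)]
    return [any(not (l <= v <= h) for v, l, h in zip(pt, lo, hi))
            for pt in test]
```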

CATE estimates can be used to assign treatments to subjects, but in many studies, estimating the CATE is not the ultimate goal. Researchers often want to understand the underlying causal mechanisms. In Chapter 5, we discuss a modification of the RF algorithm that is particularly interpretable and allows practitioners to understand the underlying mechanism better. Usually, RF are based on deep regression trees that are difficult to understand. In this new version of the RF, we use linear response functions and very shallow trees to make the results more easily understandable. The algorithm finds splits in quasi-linear time and locally adapts to the smoothness of the underlying response functions. In an experimental study, we show that it leads to shallow and interpretable trees that compare favorably to other regression estimators on a broad range of real-world data sets.

The social sciences pose particular challenges for statistics: the difficulty of conducting randomized experiments, the large variation among humans, the difficulty of collecting complete datasets, and the typically unstructured nature of data at the human scale. At the same time, new technology allows for increased computation and data recording, which has in turn brought forth new innovations for analysis. Because of these challenges and innovations, statistics in the social sciences is currently thriving and vibrant.

This dissertation is an argument for evaluating statistical methodology in the social sciences along four major axes: *validity*, *interpretability*, *transparency*, and *employability*. We illustrate how one might develop methods that achieve these four goals with three case studies.

The first is an analysis of post-stratification, a form of covariate adjustment for evaluating treatment effects. In contrast to recent results showing that regression adjustment can be problematic under the Neyman-Rubin model, we show that post-stratification, which can easily be done in, e.g., natural experiments, has precision similar to that of a randomized block trial as long as there are not too many strata: the difference is $O(1/n^2)$. Post-stratification thus potentially allows for transparently exploiting predictive covariates and random mechanisms in observational data. This case study illustrates the value of analyzing a simple estimator under weak assumptions, and of finding similarities between different methodological approaches so as to carry earlier findings into a new domain.
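The post-stratified estimator itself is simple: within-stratum differences in means, weighted by stratum size. A minimal sketch (`post_stratified_ate` is an illustrative name; the chapter's variance analysis is not shown):

```python
from statistics import mean

def post_stratified_ate(strata, ys, ts):
    """Post-stratified treatment-effect estimate: the difference in
    means within each stratum, weighted by the stratum's share of the
    sample. ts must be 0/1 treatment indicators."""
    n = len(ys)
    groups = {}
    for s, y, t in zip(strata, ys, ts):
        groups.setdefault(s, ([], []))[t].append(y)  # (controls, treated)
    return sum((len(ctrl) + len(trt)) / n * (mean(trt) - mean(ctrl))
               for ctrl, trt in groups.values())
```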

We then present a framework for building statistical tools to extract topic-specific key-phrase summaries of large text corpora (e.g., the New York Times) and a human validation experiment to determine best practices for this approach. These tools, built from high-dimensional, sparse classifiers such as L1-logistic regression and the Lasso, can be used to, for example, translate essential concepts across languages, investigate massive databases of aviation reports, or understand how different topics of interest are covered by various media outlets. This case study demonstrates how more modern methods can be evaluated using external validation to show that they produce meaningful and comprehensible results that can be broadly used.
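The flavor of topic-specific phrase scoring can be conveyed with a smoothed log-odds score, which is a deliberate simplification: the tools described use sparse L1-penalized classifiers, not this scorer, and `keyphrase_scores` is a hypothetical name.

```python
import math
from collections import Counter

def keyphrase_scores(topic_docs, background_docs, smooth=1.0):
    """Score phrases by smoothed log-odds of appearing in topic vs.
    background documents; a simple stand-in for the sparse L1-logistic
    classifiers, which instead select a few high-weight phrases."""
    t = Counter(w for doc in topic_docs for w in doc)
    b = Counter(w for doc in background_docs for w in doc)
    nt, nb = sum(t.values()), sum(b.values())
    vocab = set(t) | set(b)
    V = len(vocab)
    return {w: math.log((t[w] + smooth) / (nt + smooth * V))
             - math.log((b[w] + smooth) / (nb + smooth * V))
            for w in vocab}
```

Phrases with high scores characterize the topic; the human validation experiment described above is what checks that such phrases are actually meaningful to readers.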

The third chapter presents the trinomial bound, a new auditing technique for elections that rests on very minimal assumptions. We demonstrated its usability by auditing contests in Santa Cruz and Marin counties, California, in November 2008.

The audits were risk-limiting, meaning they had a pre-specified minimum chance of requiring a full hand count if the outcomes were wrong. The trinomial bound gave better results than the Stringer bound, a tool common in accounting for analyzing financial audit samples drawn with probability proportional to an error bound. This case study focuses on generating methods that are employable and transparent so as to serve a public need.

Throughout, we argue that, especially in the difficult domain of the social sciences, we must spend extra attention on the first axis of validity. This motivates our using the Neyman-Rubin model for the analysis of post-stratification, our developing an approach for external, model-independent validation for the key-phrase extraction tools, and our minimal assumptions for election auditing.