eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

Essays on Treatment Effect Heterogeneity in Education Policy Interventions

No data is associated with this publication.
Abstract

The key focus of this dissertation is on how to understand and measure treatment effect heterogeneity in experimental or quasi-experimental evaluations of educational policy interventions. When testing the impact of an intervention, it can be important to know not just the overall or average effect of the intervention on key outcomes but also how the effect varies across subgroups of study participants, defined along several dimensions, including pre-treatment characteristics, site-level contexts, and position in the distribution of an outcome measure. Heterogeneity, or variation, in effects has critical implications for understanding how interventions work and which aspects of an intervention's implementation are most closely associated with its effectiveness. This dissertation examines both methodological and substantive questions that pertain to such heterogeneity.

In Chapter 1, I examine Bayesian hierarchical models for multi-site trials that allow estimation of site-specific treatment effects and their distribution. Modeling site-specific effects using observed data is a critical component of understanding the results of multi-site trials. A standard approach leveraging Bayesian methods is to rely on Gaussian distributional assumptions and to use the posterior means (PM) of the random effects. The standard approach can be misleading, however, in the estimation of individual site-specific effects and of their empirical distribution and ranks. In this chapter, I review two strategies developed to improve inferences regarding site-specific effects: (a) relaxing the normality assumption by flexibly modeling the random-effects distribution using Dirichlet process mixture (DPM) models, and (b) replacing the PM as the summary of the posterior with alternative estimators, such as the constrained Bayes (CB) or triple-goal (GR) estimators. I then examine when and to what extent the two strategies, and combinations thereof, work or fail under varying conditions.
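The over-shrinkage problem that motivates the CB estimator can be illustrated with a stylized simulation. The sketch below is not the chapter's actual design: it assumes known hyperparameters and a Gaussian random-effects distribution, and all numbers are illustrative. It shows that PM estimates of site effects understate their dispersion, and that a CB-style rescaling of the ensemble restores the target variance.

```python
# Stylized sketch (not the chapter's simulation design): posterior-mean (PM)
# shrinkage estimates of site effects understate their dispersion; a
# constrained-Bayes (CB) style rescaling restores it. Hyperparameters are
# assumed known for simplicity; all numbers are illustrative.
import random
from statistics import mean, pvariance

random.seed(0)
J = 500                        # number of sites
tau2 = 1.0                     # variance of true site effects
s2 = 1.0                       # sampling variance of each site's estimate
shrink = tau2 / (tau2 + s2)    # shrinkage factor toward the prior mean (0.5)

theta = [random.gauss(0.0, tau2 ** 0.5) for _ in range(J)]   # true effects
est = [t + random.gauss(0.0, s2 ** 0.5) for t in theta]      # site estimates
pm = [shrink * e for e in est]                               # posterior means (prior mean 0)

# PMs are over-shrunk: Var(PM) = shrink * tau2 = 0.5, not tau2 = 1.0
var_pm = pvariance(pm)

# CB-style fix: rescale deviations from the ensemble mean so the ensemble
# variance matches the (here, known) variance of the true effects.
m = mean(pm)
cb = [m + (tau2 / var_pm) ** 0.5 * (p - m) for p in pm]
var_cb = pvariance(cb)

print(f"Var(PM) = {var_pm:.3f}, Var(CB) = {var_cb:.3f}, target tau^2 = {tau2}")
```

In the full CB estimator the target variance is itself a posterior quantity rather than a known constant, but the mechanics are the same: individual-effect optimality (PM) and ensemble-distribution fidelity (CB, GR) pull in different directions.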

In Chapter 2, I study methodological issues that arise in practice when Bayesian quantile regression (BQR) models are applied. BQR models allow us to study treatment effect heterogeneity across the distribution of an outcome measure, such as a student achievement test score. In BQR, the most commonly applied likelihood is the asymmetric Laplace (AL) likelihood because it is computationally convenient for Markov chain Monte Carlo algorithms. To ease computation further, the scale parameter of the AL distribution is often fixed at a pre-estimated value or an arbitrary constant. This chapter demonstrates that posterior inference in BQR with an AL likelihood is highly sensitive to the choice of this fixed scale parameter. Based on sensitivity analyses using Monte Carlo simulations and a real-data example, I make two claims. First, not only the variance obtained directly from the posterior distribution but also the adjusted posterior variance proposed by Yang et al. (2015) is highly sensitive to the value of the scale parameter. Second, in finite samples, both conventional and Bayesian point estimators can be biased at extreme quantiles. Researchers need to be aware of the possibility of low coverage probabilities at extreme quantiles, caused mainly by biased point estimates.
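The role of the AL scale parameter can be seen in a minimal sketch (illustrative data, not the chapter's analysis). The q-th quantile minimizes the "check" loss rho_q(u) = u * (q - 1[u < 0]), which is the kernel of the AL negative log-likelihood. A fixed scale sigma only rescales that loss, so it leaves the point estimate unchanged, while in the Bayesian setting it directly controls the spread of the posterior, which is one reason fixing it carelessly distorts uncertainty statements.

```python
# Minimal sketch with illustrative data: the sample q-quantile minimizes the
# asymmetric Laplace (AL) check loss, and a fixed AL scale sigma rescales the
# loss without moving the minimizer (it only changes posterior spread).
def check_loss(c, data, q, sigma=1.0):
    """Sum of AL check losses rho_q((x - c) / sigma) around candidate c."""
    total = 0.0
    for x in data:
        u = x - c
        total += u * (q - (1.0 if u < 0 else 0.0))
    return total / sigma

data = list(range(1, 11))   # 1, 2, ..., 10
q = 0.25

# Search candidate locations over the data points themselves.
argmin_s1 = min(data, key=lambda c: check_loss(c, data, q, sigma=1.0))
argmin_s5 = min(data, key=lambda c: check_loss(c, data, q, sigma=5.0))

print(argmin_s1, argmin_s5)  # same minimizer under both scales
```

The minimizer lands at the lower-quartile order statistic of the data, and it is identical under sigma = 1 and sigma = 5; only the loss surface (and hence the implied posterior curvature) changes.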

In Chapter 3, I examine the use of the grouped/multilevel instrumental variable (IV) quantile regression approach, a quantile extension of Hausman and Taylor (1981). The common approach of estimating the shift in group-level (level-2) averages of individual-level (level-1) outcomes may mask important but more subtle effects on the outcome distribution. For example, a school-level intervention may have little effect on school-level average test scores but may cause a substantial shift in the lower quantiles of the within-school test score distribution if the intervention is particularly beneficial for low-performing students. As a real-world empirical example, I use the grouped/multilevel IV quantile approach to estimate the effects of district-level increases in per-pupil spending on quantiles of the within-district distribution of school quality measures. I show how new dollars flowing to districts did affect the mix of teachers and organizational practices inside schools, but in ways that worked against narrowing disparities. Better-funded high schools reduced access to college-prep courses relative to electives, and novice teachers were often assigned to courses serving English learners, inequities that widened in high-poverty schools.
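The grouped IV quantile idea can be caricatured in two steps. The sketch below uses simulated data and is illustrative only, not the chapter's estimator or its school-finance data: step 1 computes a within-group quantile of the outcome, and step 2 runs a group-level Wald/IV regression of that quantile on an endogenous group treatment, using an exogenous instrument.

```python
# Highly simplified two-step sketch of the grouped IV quantile idea
# (simulated data; illustrative only, not the chapter's estimator or data).
import random

random.seed(1)
G, n, q = 1000, 50, 0.25
beta = 2.0                                    # true treatment effect (location shift)

def cov(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

z, d, yq = [], [], []
for _ in range(G):
    zg = random.gauss(0, 1)                   # instrument (exogenous)
    ug = random.gauss(0, 1)                   # unobserved group confounder
    dg = 1.0 if zg + 0.5 * ug > 0 else 0.0    # endogenous group treatment
    y = sorted(0.5 * ug + beta * dg + random.gauss(0, 1) for _ in range(n))
    z.append(zg); d.append(dg)
    yq.append(y[int(q * n)])                  # step 1: within-group q-quantile

# Step 2: group-level Wald/IV estimate vs. the confounded OLS estimate.
iv_est = cov(z, yq) / cov(z, d)
ols_est = cov(d, yq) / cov(d, d)
print(f"IV: {iv_est:.2f} (true {beta}), OLS: {ols_est:.2f}")
```

Because the confounder raises both treatment take-up and the outcome here, the naive group-level OLS estimate is biased upward, while the IV estimate recovers the true shift in the lower quantile; the actual grouped/multilevel estimator handles many groups, covariates, and multiple quantiles jointly.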


This item is under embargo until February 16, 2025.