Standardized testing of K-12 students has become common across countries in the last decade. With the No Child Left Behind Act (2001), the U.S. mandated that all public school students in grades 3-8 be tested every year, with the promise that value-added measures of teacher quality, computed from student test scores, may be useful tools to help manage the teacher workforce and improve the efficiency of schools. But this requires that value-added scores be reliable estimates of teacher effectiveness. Because they are based on observational data, they may be biased by systematic patterns in the assignment of students to teachers. Whether assignment processes permit unbiased estimation of teacher value-added is a matter of great dispute.
In the first chapter, I loosen the assumption from past research that assignment processes are identical at all schools. Using variance decomposition techniques on children's prior-year test scores, I develop a classification procedure that allows me to identify schools where assignments are random and schools where they are nonrandom. I show that about 55% of elementary schools in North Carolina systematically sort students with higher and lower scores on the previous year's tests into different classes, a pattern of student “tracking”. About 80% of these schools also allocate the classes of high and low achievers to the same teachers year after year, a pattern of “matching” teachers to certain students. In a descriptive analysis, I explore which school- and district-level observable characteristics predict a school's sorting practices. I show that larger schools, schools with more heterogeneous student populations in terms of achievement and free and reduced-price lunch status, and schools with higher teacher turnover are more likely to engage in tracking & matching. School district effects explain about 30% of the variation in school-level assignment policies, mainly due to the district-level socioeconomic environment.
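The idea of classifying a school by decomposing the variance of prior-year scores can be illustrated with a minimal sketch. It is not the procedure used in the chapter: it assumes a single school-grade cell, measures the share of prior-score variance that lies between classrooms, and compares that share with its distribution under random reshuffling of students. The function names `between_class_share` and `tracking_p_value`, the permutation test, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def between_class_share(scores, classes):
    """Share of the total variance in prior-year scores that lies between classrooms."""
    scores = np.asarray(scores, dtype=float)
    classes = np.asarray(classes)
    grand_mean = scores.mean()
    total_ss = ((scores - grand_mean) ** 2).sum()
    if total_ss == 0:
        return 0.0  # no variation in scores, so nothing to sort on
    between_ss = 0.0
    for c in np.unique(classes):
        members = scores[classes == c]
        between_ss += members.size * (members.mean() - grand_mean) ** 2
    return between_ss / total_ss


def tracking_p_value(scores, classes, n_perm=2000, seed=0):
    """Permutation p-value: how often does random assignment produce a
    between-class share at least as large as the observed one?"""
    rng = np.random.default_rng(seed)
    classes = np.asarray(classes)
    observed = between_class_share(scores, classes)
    exceed = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(classes)  # keeps class sizes fixed
        if between_class_share(scores, shuffled) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


# Toy data (hypothetical): 40 students with prior scores 0..39 in two classes.
prior = np.arange(40)
tracked = np.array([0] * 20 + [1] * 20)  # low scorers in class 0, high in class 1
balanced = np.array([0, 1] * 20)         # classes alternate along the score ranking

p_tracked = tracking_p_value(prior, tracked)
p_balanced = tracking_p_value(prior, balanced)
```

In this sketch a small p-value would flag a school-grade cell as tracking, while a large one is consistent with random assignment. Detecting matching would additionally require following the same teachers across years, which the sketch does not attempt.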
In the second chapter, I leverage the variation in classroom assignment practices to learn about the magnitude of biases in teacher value-added estimates. Biases are most likely in tracking & matching schools, and least likely in schools that are neither tracking nor matching (random schools). I use data on teachers who stay in random assignment schools for two consecutive years as a control group, and compare them to teachers who switch between these schools and tracking & matching schools. In a minimum distance framework, using the autocovariance of value-added for teachers who stay in or move between different types of schools, I can identify both the variance of the bias and its covariance with teachers’ true effects. I document substantial biases in value-added measures. Importantly, these biases are negatively correlated with teachers' true effects: teachers who are above average in their true effectiveness tend to be assigned students who make them look bad. This negative correlation helps to explain the discrepancy with previous results: if the correlation is assumed to be zero when it is in fact negative, the variance of the bias is understated by about half. Overall, I conclude that the quality of value-added assessments is likely to depend on the nature of the student-teacher allocation process used at specific schools or school systems.
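The identification argument can be made concrete with a stylized set of moments. The following is a sketch under simplifying assumptions that are not taken from the chapter: measured value-added is the sum of a fixed true effect, a persistent teacher-specific bias that is present only in tracking & matching schools, and serially uncorrelated noise. The symbols (\mu_j, b_j, T_{jt}, C_{RR}, C_{RT}, C_{TT}) are introduced here for illustration only.

```latex
% Measured value-added of teacher j in year t; T_{jt} = 1 in a tracking & matching school.
\begin{align*}
\hat{\mu}_{jt} &= \mu_j + T_{jt}\, b_j + \varepsilon_{jt} \\[4pt]
% Autocovariances of value-added across two consecutive years, by type of transition:
C_{RR} &= \sigma_\mu^2
  && \text{(stay in random schools)} \\
C_{RT} &= \sigma_\mu^2 + \sigma_{\mu b}
  && \text{(switch between school types)} \\
C_{TT} &= \sigma_\mu^2 + 2\sigma_{\mu b} + \sigma_b^2
  && \text{(stay in tracking \& matching schools)} \\[4pt]
% Solving the three moments for the bias parameters:
\sigma_{\mu b} &= C_{RT} - C_{RR} \\
\sigma_b^2 &= C_{TT} - 2\,C_{RT} + C_{RR} \\[4pt]
% Imposing \sigma_{\mu b} = 0 instead yields
\tilde{\sigma}_b^2 &= C_{TT} - C_{RR} = \sigma_b^2 + 2\sigma_{\mu b}
\end{align*}
```

Under these assumptions, imposing a zero covariance gives a bias variance that is too small whenever \sigma_{\mu b} < 0, which is the direction of the understatement described above. With more moments than parameters, a minimum distance estimator would choose the parameters that best fit all autocovariances jointly rather than solving the system exactly.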
In the third chapter, I estimate the effect of peers, defined as teachers at the same school and grade level, on teachers' own value-added. Traditional estimates using leave-out means imply large and significant positive spillovers among teachers. However, I exploit a more compelling research design following Mas et al. (2009), and approximate the following thought experiment in a regression framework: A low value-added teacher, Teacher A, is randomly replaced by a new, high value-added teacher, Teacher D, at a particular school-grade level. How does the value-added of incumbent teachers B and C, who worked at the same school-grade the year before, change in response? I find that when a teacher is replaced by a teacher who is 1 standard deviation (SD) better, incumbent peers' value-added increases by 0.05-0.12 SDs. When I restrict the sample to non-expanding and non-contracting school-grade levels and control for unobserved shocks, spillovers in random assignment schools become statistically indistinguishable from 0, while they remain consistently positive at around 0.06-0.1 SDs in tracking & matching schools. Looking at changes in student observables in the incumbents' classrooms reveals that assignments in random schools do not change, as expected. Therefore, the coefficient in these schools is a clean estimate of peer effects, which in turn are small and insignificant. However, average prior achievement in the incumbent teachers' classrooms decreases significantly in tracking & matching schools, and this may cause the significant spillovers in these schools. The exact mechanism is the subject of further research.