Behavioral Types: A New Perspective on Estimating Treatment Effects in Social Science Experiments with Binary Responses
In the year 2000, Gerber and Green published the results of a field experiment which examined the impact of electoral campaigns on voter participation. Since this landmark work, more than one hundred similar studies have appeared in the political science literature. These randomized controlled trials are usually conducted within get-out-the-vote (GOTV) drives seeking to increase voter turnout. The surge in GOTV experiments was partly due to a statistical innovation that preceded Gerber and Green’s publication, the average treatment effect for the treated (ATT), which allowed researchers to compare directly those treated to a similar group that was assigned to the control.
In this research we focus on settings common to many social science field experiments, such as GOTV studies, where participants may or may not comply with the treatment protocol assigned by the experimenter. For experiments with binary outcomes, we show that each individual in the study may be classified as one of a finite number of distinct types. We call these behavioral types because they characterize the individual’s complete reaction (both their measured response and whether they receive treatment) to assignment to each possible experimental group. In this context, the data are generated by randomly allocating these behavioral types to the different levels of treatment. Thus, the model is parameterized by the unknown proportions of the different behavioral types, so that many statistical aspects of the experiment, such as commonly studied average treatment effects, may be written as functions of these proportions.
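As a minimal illustration of why the number of types is finite, consider the simplest case: two assignment arms (control and treatment), a binary indicator of receiving treatment, and a binary response. The sketch below enumerates every possible reaction profile under those assumptions. The names `enumerate_types` and `one_sided` are hypothetical, and the one-sided noncompliance restriction (no one assigned to control receives treatment) is used only as an example of how a restriction shrinks the set of types; it is not necessarily the restriction adopted in this work.

```python
from itertools import product

# Assumption (illustrative, not the author's notation): two assignment arms
# (0 = control, 1 = treatment), binary treatment receipt d, binary response y.
ARMS = (0, 1)
REACTIONS = list(product((0, 1), repeat=2))  # all (d, y) pairs within one arm


def enumerate_types(arms=ARMS):
    """Return every mapping: assigned arm -> (received_treatment, response)."""
    return [dict(zip(arms, combo))
            for combo in product(REACTIONS, repeat=len(arms))]


def one_sided(behavioral_type):
    """Hypothetical restriction: nobody assigned to control receives treatment."""
    received_under_control, _ = behavioral_type[0]
    return received_under_control == 0


all_types = enumerate_types()
restricted = [t for t in all_types if one_sided(t)]

print(len(all_types))   # 16 unrestricted types (4 reactions per arm, 2 arms)
print(len(restricted))  # 8 types remain under one-sided noncompliance
```

Each remaining type has an unknown population proportion, and these proportions are the parameters of the model described above.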
Viewing the data as generated by these behavioral types changes the analysis of the experiment in two ways. First, it changes the perspective on what is being estimated. Instead of finding a particular treatment effect, the ultimate goal can be seen as estimating proportions of behavioral types. With this frame of reference, the effect of a certain treatment will be most accurately represented as the fraction of the experimental sample for which the treatment has an effect. Second, by clarifying the underlying data generating process, a behavioral-types approach directs the resulting statistical analysis.
We use a well-cited example to introduce behavioral types before providing formal definitions. We present the ATT as a case study of how to apply a behavioral-types approach to a design known to many social science researchers. Understanding the data generating process allows us to evaluate the bias and variance of the ATT estimator, and we show that the variance depends on the choice of sampling assumptions. We then provide rigorous definitions of a behavioral type and of restrictions which reduce the number of behavioral types in a population to a number for which the proportion of each type may be estimated. We present three experimental designs, give a strategy for identifying the proportion of each type, and show how treatment effects may be obtained from these proportions.
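For orientation, the ATT in designs with one-sided noncompliance is commonly estimated as a ratio: the difference in mean response between the assigned-treatment and control groups, divided by the fraction of the assigned-treatment group that actually received treatment. The sketch below assumes that standard ratio form; the function name `att_estimate` and the toy data are hypothetical and are not taken from any of the experiments discussed here.

```python
def att_estimate(y_treat, d_treat, y_control):
    """Ratio estimate of the ATT under one-sided noncompliance (assumed form).

    y_treat   -- binary responses of units assigned to treatment
    d_treat   -- binary indicators: did each assigned unit receive treatment?
    y_control -- binary responses of units assigned to control
    """
    if not y_treat or not y_control or len(y_treat) != len(d_treat):
        raise ValueError("groups must be non-empty and y_treat/d_treat aligned")
    # Intent-to-treat effect: difference in mean response by assignment.
    itt = sum(y_treat) / len(y_treat) - sum(y_control) / len(y_control)
    # Contact rate: share of the assigned-treatment group actually treated.
    contact_rate = sum(d_treat) / len(d_treat)
    if contact_rate == 0:
        raise ValueError("no assigned unit received treatment")
    return itt / contact_rate


# Toy data (made up for illustration only).
y_treat = [1, 1, 0, 1, 0, 1, 0, 0]    # response rate 0.5
d_treat = [1, 1, 1, 1, 0, 0, 0, 0]    # contact rate 0.5
y_control = [1, 0, 0, 0, 1, 0, 0, 0]  # response rate 0.25

print(att_estimate(y_treat, d_treat, y_control))  # 0.5
```

Because the estimate is a ratio of two sample quantities, its bias and variance depend on what is treated as random, which is why the sampling assumptions matter.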
A behavioral-types approach is well suited to multi-treatment experiments because it distills often complex designs into the problem of estimating the proportions of a manageable number of types. We apply the behavioral-types approach to four published social science field experiments involving multiple levels of ordered treatment. For each, we show how the interpretations and the statistical analyses differ under a behavioral-types approach and how they can lead to different conclusions. Through the applications we illustrate how the behavioral-types approach provides insight into a range of experimental designs, such as those with spillover effects or partial ordering of treatment levels.
For two of the four applications we further examine the issue of joint significance by constructing multi-dimensional confidence regions for the proportion of behavioral types. We find that normal approximation methods perform poorly, but the shortcomings can be corrected by the bootstrap. However, even the bootstrap regions may not attain the desired coverage levels, so we adjust our regions using a double bootstrap. We discuss other methods that merit further exploration.