More on Measuring Two-Party Competition: A Response to Dunleavy

Abstract
Gaines and Taagepera [(2013) How to operationalize two-partyness. Journal of Elections, Public Opinion and Parties, 23(4), pp. 387–404] propose two indices of two-party competition for district-level data, both of which are alleged to be flawed. The case against them rests mainly on whether elections with one dominant party should be regarded as exhibiting one- or two-way competition. For those inclined to see 90–10% and 50–50% outcomes as different in kind, our indices can provide better measures than the popular effective number of parties or the “gap”. We agree that assessment of a set of outcomes, in a given election or over time, requires careful attention to the important distinction between micro-level data and aggregate measures.


Introduction
Our recent article (Gaines & Taagepera, 2013) makes four simple, closely related points: (1) the modal analysis of conformity with Duverger's Law in district-level competition involves assessment of whether the effective number of parties is, roughly, two; (2) this criterion seems not to match the intuition of at least some scholars experienced in working with electoral data, and yields some arguably counter-intuitive findings; (3) rival indices might better capture what is meant by two-party competition at the level of the district; and (4) inspection of values of these indices for a set of districts can be a useful tool, among others, in assessing whether a given election or electoral system is Duvergerian. Dunleavy (2014) has quickly produced a lengthy "comment" purporting to demonstrate critical faults in each index, and urging others not to employ them in any future work. Here, we briefly explain why we find his critique unpersuasive.
Which is a better example of two-party competition, 90-5-5 or 90-10-0? We think neither is at all well described as "two-party", and the colleagues we surveyed agreed. The indices we developed accordingly assign comparatively low scores to both outcomes. Dunleavy's proposed "gap" index, by contrast, converts them into its lowest and highest possible scores, 0 and 100, respectively. Although 90-10-0 features exactly two competitors with non-zero vote totals, it is not a literal adherence to counting positive vote shares that drives this contrast. The gap also treats 84-7-7-2 (gap = 0) and 84-14-1-1 (gap = 92.9) as highly divergent. If these values seem fine, Dunleavy's critique of our work may resonate. Those who find these scores worrying, however, will, if they work through the whole of Dunleavy's comment, discover that how to interpret highly lopsided races of this sort is at the heart of our disagreement.
Dunleavy (2014) opens with a general complaint that electoral analysis is "plagued" by poorly conceived indices, and lists six conditions for problems. This framework turns out to be irrelevant to the remainder of the piece, as all of the conditions are plainly vague and subjective. Whether or not an index is understood by its author, supported by "exaggerated" claims for its utility, described in "complete and accurate mathematical" terms, and so on are points on which reasonable people will frequently differ. Whatever one thinks of the gap measure that Dunleavy eventually broaches, we are confident that few will concur that it somehow passes these six tests while our indices fail.
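For readers who wish to check how the conventional criterion treats the lopsided outcomes discussed above, the following is a minimal Python sketch of the Laakso-Taagepera effective number of parties (the reciprocal of the sum of squared fractional vote shares). The function name and the choice of example outcomes are illustrative only and do not come from the original article.

```python
def effective_number(shares):
    """Laakso-Taagepera effective number: 1 / sum of squared fractional shares.

    `shares` may be given as percentages or raw votes; they are
    normalized to fractions of the total before squaring.
    """
    total = sum(shares)
    return 1.0 / sum((s / total) ** 2 for s in shares)


# Outcomes mentioned in the text, plus the exact two-way tie for reference.
examples = [(90, 5, 5), (90, 10, 0), (84, 7, 7, 2), (84, 14, 1, 1), (50, 50)]

for outcome in examples:
    label = "-".join(str(s) for s in outcome)
    print(f"{label:>12}  effective number = {effective_number(outcome):.2f}")
```

On this measure 90-5-5 and 90-10-0 are nearly indistinguishable (both a little above 1.2), which is the sense in which the contrast the gap draws between them is so stark.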
Where, then, do our indices go wrong? We will set aside the reiterated claim that we repeatedly made over-ambitious claims of applicability: we were eminently clear that we see our indices as only two possibilities for measuring district-level competition in votes, and not the last word even in that limited domain. So too the charges that our description was "extremely sketchy" and that we introduced too little empirical work in support of the indices amount to Dunleavy ignoring the inherent space limitations of journal articles or, at minimum, placing little value on brevity.

Does 1 = 2?
Focus on the specific complaints made about the indices. On a skim, it might seem that Dunleavy raises a large number of troubling properties; in fact, most of the complaints involve this difference in taste for how to interpret cases with a dominant party. A related point is that Dunleavy exhibits a preference for conflating district-level and aggregate-level judgments. We regard dominant performance by a single party as akin to one-party competition, whereas Dunleavy, though he never quite says so, thinks "two-party" inherently means "at most two-party" and thus should include (many) one-party-dominant cases.
Hence, Dunleavy's prolonged complaint that T is "internally complex" and subject to two forces and thus not "a single well-behaved index" (2014, p. 4) amounts to a charge that no one should ever treat all outcomes with very large top shares as deviations from two-party competition. Is it in fact "perverse" to score 69-30-1 as a worse fit to two-party competition than 30-30-1-...-1? Dunleavy thinks so; we do not. We noted explicitly that those "who understand Duverger's prediction to have been 'no more than two parties competing' ..." will be troubled by our D2 index (2013, p. 400), so Dunleavy's first complaint is not so much a discovery as a repackaging of an issue we already noted, except that he raises the stakes by assuming that "two-party", even when not motivated by Duverger's law, must always encompass many one-party cases.
Is the discipline united in its view that two-party really means no-more-than-two-party? We think not, and we can cite the small data set of experts' ratings discussed in our paper to show that we are not alone (see the top panel of our Figure 1 (2013, p. 393)). Our small study recorded a few scholars' impressions of cases with no more than three competitors, and a larger, more systematic investigation of how experts rate even more scenarios would be most welcome. Dunleavy, however, is comfortable making claims on behalf of "almost all political scientists" with no supporting evidence (2014, p. 10).

Does 2 = 2?
A second point made by Dunleavy is that our indices over-emphasize the significance of the two-way, 50-50% tie. We are said to have made an "unevidenced and counterintuitive decision to privilege a (50, 50) configuration as somehow the heart of 'two partyness'" (2014, p. 13). Intuitions vary, of course, and being "unevidenced" is a puzzling accusation about a theoretical, not empirical, claim. An exact two-way tie is obviously a pure case of two-party competition, so we make no apologies for devising indices that return maximum values from that input. Neither D2 nor T produces a dichotomous categorization of outcomes, scoring 50-50 as two-party and all others as "not". We treat first-place ties with scattering votes as different in degree, not kind, from the precise two-way split. In contrast to our clear and almost tautological standard, Dunleavy prefers the metaphor of "a premier league" of two, plus others. That is not an operationalization, and his gap measure does not match that loose account in any case, as we show below. In our empirical examples, we employ arbitrary thresholds at which to dichotomize (noting the arbitrariness explicitly). In that step, as we discard the variance in index scores within large ranges, we do exactly the opposite of what Dunleavy sees as the folly of over-stressing 50-50.

Sundry Points
It would be tedious to revisit each and every claim made by Dunleavy. A few are demonstrably false. We did not, as a general matter or in our empirical examples, assume "that there are very large numbers of parties on infinitely small vote shares" (2014, p. 7). Nor is it true that "... in the course of their exposition Gaines and Taagepera repeatedly cite examples with only two parties competing ..." (2014, note 3, p. 27). Except for a few mentions of the 50-50 case, we discussed no two-way splits in the exposition. (Some of these did feature in the survey given to colleagues, which appears in the appendix.) In quoting a passage about T, Dunleavy inserted needless "sic"s (2014, p. 14), apparently missing its point that T is a function of the vote shares of the top three parties and is thus implicitly sensitive to the total vote won by all others, but not to the allocation of that vote across the 4th through kth parties.
The four 10-seat examples set forth in Dunleavy's Figure 3 and Tables 2, 4, and 5 lead to his comparing summary statistics (mean, median, standard deviation) for the distributions of T and D2, ignoring the empirical precedent in our article, where we focus on proportions of cases having high T and D2 values. And, again, Dunleavy's insistence that one-party cases constitute an "integral aspect of strong two party competition" (2014, p. 10) amounts to either yet another reassertion that "two" can only mean "one or two" or a conflation of distinct claims about district-level and aggregate competition.
Other claims are unclear because of Dunleavy's penchant for eschewing precise technical language. Note 3 (2014, p. 27) says that the T index delivers scores that are "highly suppressed", an obscure claim even if one assumes that Dunleavy actually meant "compressed". Dunleavy counsels that, "analysts should always consider their data carefully (for instance by always plotting them against the largest vote share)" (2014, p. 2). While confusing "data" with indices produced from data, he also offers no rationale for this diagnostic procedure. Our indices, the effective number of parties and related entropy-based measures, Dunleavy's gap, and all of the other related indices convert compositional vectors into scalars. The largest vote (or, in other contexts, seat) share is one component of the input, but there is no clear reason to assert that an index is "well behaved" if it "runs consistently from 0 to 1" as a function of that share (2014, p. 15). A more careful statement would emphasize whether or not that function is continuous, differentiable, twice continuously differentiable, monotonic, etc. But absent an argument for focusing on the bivariate index-top-share association, even a properly worded claim would not be compelling.
Figure 5 is said to illustrate that the minimum attainable D2 score depends completely on the number of parties and the top finisher's vote share. In fact, the logical range of D2, unlike that of the effective number, is unaffected by the number of competitors. The asymmetry in how D2 penalizes departures from two-way competition in the one- and multi-party directions was noted explicitly in our original paper (2013, p. 392), so Dunleavy's discovery of this point is, again, unoriginal. One who thinks exact three-, four-, ..., k-way ties should all be gauged equally far from two-way competition can construct an index with that property. We chose not to, and noted as much.
The extra ingredient in Dunleavy's complaint is that he imposed a minimal-dispersion constraint via assumptions of the form "at least x of the N parties win at least y percent of the vote". That sort of ad hoc constraint does, indeed, truncate the logical ranges of our indices, and those of the gap and the effective number. Exactly how sensitive should a measure of two-partyness be to constraints of this form? That question is unexplored by Dunleavy, who merely implies that no index should be affected by any such constraints.
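The truncation described above can be illustrated with a short Python sketch. It uses the effective number of parties, since the formulas for T and D2 are not restated in this response, and it assumes integer percentage shares among five parties with the constraint "at least x = 5 parties win at least y = 1 percent". The helper names (`partitions`, `satisfies`, `index_range`) and the particular values of x and y are illustrative assumptions.

```python
def partitions(total, parts, largest=None):
    """Yield non-increasing tuples of `parts` non-negative integers summing to `total`."""
    if largest is None:
        largest = total
    if parts == 1:
        if total <= largest:
            yield (total,)
        return
    # The first share must be at least the ceiling of total / parts,
    # otherwise the remaining (smaller) shares cannot make up the sum.
    smallest_first = -(-total // parts)
    for first in range(min(total, largest), smallest_first - 1, -1):
        for rest in partitions(total - first, parts - 1, first):
            yield (first,) + rest


def effective_number(shares):
    """Laakso-Taagepera effective number of parties."""
    total = sum(shares)
    return 1.0 / sum((s / total) ** 2 for s in shares)


def satisfies(outcome, x, y):
    """True if at least x parties win at least y percent of the vote."""
    return sum(share >= y for share in outcome) >= x


def index_range(outcomes, index):
    """Smallest and largest index value attained over a set of outcomes."""
    values = [index(o) for o in outcomes]
    return min(values), max(values)


all_outcomes = list(partitions(100, 5))
constrained = [o for o in all_outcomes if satisfies(o, x=5, y=1)]

unconstrained_range = index_range(all_outcomes, effective_number)
constrained_range = index_range(constrained, effective_number)

print("all outcomes:       min %.3f  max %.3f" % unconstrained_range)
print("five parties >= 1%%: min %.3f  max %.3f" % constrained_range)
```

Under these assumptions the floor of the effective number rises from 1 (the outcome 100-0-0-0-0) to roughly 1.08 (the outcome 96-1-1-1-1), while the ceiling of 5 is unchanged. The same filtering step can be applied to any other index by passing a different function to `index_range`.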

Never Mind the Gap?
Dunleavy eventually suggests that an alternative index can prove useful in assessing fit to "an eclectic, multi-criteria concept of two-partyness" (2014, p. 24), namely the gap, in which the vote share (p) terms are expressed as percentages and N is the number of contenders having won at least one percent of the vote. Since Dunleavy asserts that this measure is "useful", "simple (and completely understood)" (2014, p. 24), we briefly examine its properties. We suggested above that the case for plotting an index against top-vote share is unclear, but Figure 1 shows, just the same, no obvious pattern in the plot of gap against top share. (Here, we again plot outcomes from the 46,262 five-component partitions of the integer 100, zero components allowed, with an additional panel showing only the 38,225 of these in which all five shares are non-zero, in deference to Dunleavy's preference for imposing minimum-dispersion constraints.) More illustrative is Figure 2, which plots gap against the effective number and our two indices, in three panels. In the latter two panels, we superimpose arbitrary thresholds as possible cut-points for dichotomizing, and note that, in this mode of analysis, it is possible to select a value for gap that more or less ensures conformity with both T and D2, despite their alleged vices, with the exception of the top left quadrant, wherein are found one-party-dominant cases. The gap regards the contrast between such pairs as 86-11-1-1-1 and 86-6-6-1-1 (both of which have an effective number of parties of roughly 1.3) as gigantic, with the former a nearly perfect example of two-party competition (gap = 90.9) and the latter its antithesis (gap = 0).
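The partition counts quoted above can be reproduced by brute force, as in the Python sketch below. The sketch also evaluates a candidate gap formula, 100 × (p2 − p3) / p2, where p2 and p3 are the second- and third-largest shares. That formula is an assumption: it is inferred solely from the six gap values quoted in this response, it ignores whatever role N plays in Dunleavy's definition, and it should not be read as his stated formula. The names `partitions` and `inferred_gap` are likewise illustrative.

```python
def partitions(total, parts, largest=None):
    """Yield non-increasing tuples of `parts` non-negative integers summing to `total`."""
    if largest is None:
        largest = total
    if parts == 1:
        if total <= largest:
            yield (total,)
        return
    # The first share must be at least the ceiling of total / parts.
    smallest_first = -(-total // parts)
    for first in range(min(total, largest), smallest_first - 1, -1):
        for rest in partitions(total - first, parts - 1, first):
            yield (first,) + rest


def inferred_gap(shares):
    """ASSUMED form of the gap, inferred from quoted values: 100 * (p2 - p3) / p2.

    Returns None when the runner-up has no votes, where the ratio is undefined.
    """
    ranked = sorted(shares, reverse=True) + [0, 0]  # pad so p2 and p3 always exist
    second, third = ranked[1], ranked[2]
    if second == 0:
        return None
    return 100.0 * (second - third) / second


outcomes = list(partitions(100, 5))
all_positive = [o for o in outcomes if min(o) > 0]
print(len(outcomes), len(all_positive))  # 46262 38225

# Gap values quoted in the text, keyed by outcome.
quoted = {
    (90, 5, 5): 0.0,
    (90, 10, 0): 100.0,
    (84, 7, 7, 2): 0.0,
    (84, 14, 1, 1): 92.9,
    (86, 11, 1, 1, 1): 90.9,
    (86, 6, 6, 1, 1): 0.0,
}
for outcome, reported in quoted.items():
    print(outcome, "quoted:", reported, "inferred:", round(inferred_gap(outcome), 1))
```

The inferred form agrees with every quoted value to one decimal place, but agreement on six examples does not establish that it matches the published definition in general.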

What Next?
Dunleavy's comment does not strike us as laudably precise or efficient at presenting empirical content, but the sheer quantity of condescending, snarky language was impressive. In the spirit of index construction, we would suggest that anyone seeking to develop a dictionary of shrillness for use in automated text analysis would find this comment useful. Our work is "extremely sketchy", while we "contrive" to suggest that our indices are error-free (2014, p. 3); our misrepresentations come "repeatedly" (2014, pp. 3, 4, 22, 27), produce "oddball results" (2014, p. 7) with indices that "perform poorly", generate "misleading scores and completely misread 'two-partyness'" (2014, p. 11), and so on, page after page. In the business of hurling insults, Dunleavy is clearly in the premier league. We suspect that the tone of indignation is partly strategic: a sounder demonstration of error could have been made quite succinctly, and without the aggrieved tone.
Tone aside, we appreciate that Dunleavy took the time to read our paper. Figure 4, covering three elections from three countries, is rather interesting (though the final panel ought to have a lower limit on the x-axis of 1, not 0, to respect logical bounds). The panel for the US House shows clearly that, at the district level, one-party competition is routine (and it would show that effect even more clearly with some jittering at the (1,0) point, given the frequency of uncontested races). For Dunleavy, "... we know empirically" (2014, p. 14) that our index is perverse to miscode the US House as not fitting the two-party label well, but that point again conflates impressions based on district votes, aggregate votes, and seats.
For any reader intrigued, puzzled, or amused by the nature of this controversy, our final word is simple. We agree with Dunleavy that two-party competition is deceptively complex, and we do not pretend to have produced perfect indices. The Laakso-Taagepera effective number of parties is, of course, also very useful, but not especially so for the particular purpose of assessing district-level competition, our focus in the original article. We think our two novel indices will prove to be useful tools for students of elections, and we invite others to try them out, experiment with the gap if they are so inclined, and perhaps offer further refinements.