The Pitfalls of Using Ancient Population, Army and Casualty Data without Expert Curation: A Review of Oka et al. 2017

The historical turn in the social sciences has been neglected by historians. This has caused social scientists to use much data which has not been curated by experts focused on the relevant time periods and geographic locations. A recent article by Oka et al. investigating the important question of historical trends in violence is a good example. A detailed survey of Oka et al.’s Persian, Greek and Roman population, army size and casualty data reveals several problems. The uncertainty in ancient data, especially casualty figures, has been underappreciated by Oka et al. In population and army size data, some speculative and dependent data points have been treated as independent. There are also inconsistencies in the data and some inflated figures. The situation is worse for the ancient army size and casualty figures for individual battles used by Oka et al., which suffer from systematic biases designed to magnify the achievements of the historian's own culture. This is clearly illustrated by the main battles of Alexander the Great against the Persians, in which Alexander's forces, although greatly outnumbered, are supposed to have inflicted hundreds or thousands of times more casualties than they sustained. These issues demonstrate the importance of curation of such data by scholars focused on the relevant time periods and cultures, and we recommend that historians become actively involved in such research.


Introduction
Comparison between large amounts of partial or complicated historical data is increasingly the focus of social-scientific study. This "historical turn" in the social sciences has come at a time when history has largely turned away from quantitative and comparative data analysis (Klein 2017; Schweber 2001). The result has been that historical data on key social topics have been treated solely by social scientists, who have often underappreciated the historical context of the data. Since this has the potential to skew their results, and hence the modern, real-world policy and other recommendations arising from these results, we argue that historians must not remain silent, but re-enter this field, bringing their detailed knowledge of historical context and source-critical considerations, and work together with social scientists in the analysis of historical comparative data. Recent efforts within history (Kohler et al. 2017; Turchin et al. 2018) have involved large teams of specialists, each compiling data in their own area of specialism. Tellingly, the leadership of these two research programs has included significant representation from natural scientists and archaeologists, respectively.
The need for engagement applies equally to ancient historians, due to the different needs of the ancient data available. Recent historical research has emphasized the major differences between ancient and modern historical data (Shaw 2010), particularly in economic data, and the lack of Big Data from the ancient period (Fourie 2016).
A recent example of historical social-scientific research, published in the Proceedings of the National Academy of Sciences by Oka and co-authors (2017), illustrates all these issues clearly, particularly in its use of ancient historical data. Oka et al. are concerned to reconstruct historical trends in violence, an important yet intractable problem. They have compiled and compared parameters of conflicts of a "historical" dataset largely dating between 2500 BCE and 1690 CE, as well as a larger set of more recent data, categorized as "contemporary," "ethnographic" and "massive conflict." Oka et al. find that the proportions of both war group size and short-term conflict deaths relative to population decrease as population increases, as expected. The proportion (C/W) of total deaths in long-term conflicts (C) relative to war group size (W) increases with increasing war group size, however, which they did not expect. Oka et al. are to be complimented for their timely, data-driven corrective to complacency regarding future conflicts, but there are serious issues with the compilation, use and referencing of the historical data used in their paper. Many of these issues could have been avoided by the incorporation of a wider range of specialists within their team.

Referencing
Overall, checking and reuse of this comparative dataset is complicated by the fact that Oka et al. do not reference the source of each data point, but provide only a generalized reference list for each whole dataset. Historical data comprise a small but significant portion (17%) of their Dataset S1: 49 points out of a total of 295. We could not link any of the Greco-Roman data points to a specific reference in Dataset S1's reference list (which is found in Dataset S9). As an indication of recent best practice, we would point to the Seshat database (Turchin et al. 2018), whose individually referenced data is openly accessible online (seshatdatabank.info/).

Polity Population and War Group Size Data
There are significant questions regarding the accuracy and uncertainty of Oka et al.'s historical population and war group size data. The lack of referencing and explanation means many of these questions cannot be answered without contacting the authors. One example is how the single "Greek City State" (400 BCE) entry was calculated, given the extraordinary variation between the approximately 1,500 Greek city-states known (Hansen 2006: 1). Presumably some kind of average is meant, but it is unclear how, and from which data, it was calculated. Oka et al. give a population of 20,000 for their "Greek City State" entry. The sizes of Greek city-states varied greatly, but we have no firm data for the population of almost all of them, with the exception of a few anomalous examples such as Athens (Hansen 2006: 1). Ideal sizes for the adult male citizen population, probably less than a quarter of the total, were 1,000 (in the Republic) and 5,040 (in the Laws) according to Plato and 10,000 according to Aristotle (Morris 1989: 5; Hansen 2006: 74, 108). A recent reconstruction based on estimated household sizes and city areas suggests around half of all poleis had total populations below 5,000 (Hansen 2006: 73-86). Some were smaller than this, but, at the other end of the scale, estimates of peak male citizen population at Athens in the fifth century BCE (one of our better documented cases) range from 40,000 to 60,000 and drop to 21,000-30,000+ in the mid-fourth century (Morris 1989: 5; Scheidel 2007: 45, 2008; Hansen 2006: 12, 91-92, 108). The male citizen population for early fifth-century Sparta has been estimated at 8-25,000. Once women, slaves and other non-citizens are included, the full populations of these larger city-states are estimated at, for Athens, 250-300,000 (fifth century BCE) and 150-200,000 (mid-fourth century BCE), and for Sparta, up to 125,000 (fifth century BCE). Recent work suggests that more poleis had populations in the tens and hundreds of thousands than is usually thought, but great uncertainty remains (Hansen 2006: 75-76) and will continue for the foreseeable future. In the context of this variability and uncertainty, we argue that a "Greek City-State" entry should be replaced by individual points for better-documented examples such as Athens and Sparta. This was the approach taken in works by Chandler in 1987 and Modelski in 2000, whose data are in current use (e.g. Reba et al. 2016).
Oka et al. are certainly aware that their population (P) and war group size (W) data for "historical states might be less reliable, as they tend to be drawn from contemporaneous accounts or later scholarly reconstruction" than contemporary or ethnographic data (Supporting Information, p.2), and they argue against the use of prehistoric data because of "the absence of reliable data on population or war group sizes" (p. E11109). We argue that they have underestimated, however, the great uncertainty in estimates of ancient historical population (Scheidel 2007: 42) and war group size (MacMullen 1980: Table S1), and that, if they had included an ancient historian, they might have excluded some of the ancient historical data along with the prehistoric data. For example, estimates of the peak population of the Roman Empire (maybe sometime in the second century CE, before the start of the Antonine Plague in 166 CE) cluster around 55-75 million, but values of 100 million and greater are found in modern scholarship (Figure 1, Table S1). Oka et al. give an average population of 75 million for the Roman Empire (in 100 CE), which accords well with the modern estimate of the peak population given above. Greater resolution than this is speculation, given there are no surviving ancient figures to go on. Nevertheless, Oka et al. have eight further Rome points and one Byzantine Empire point between 24 and 337 CE. All these are at least 13 million less than the "average" figure (Figure 1). Increasing the resolution through interpolation is found in other similar datasets (e.g. for the population of Rome in the work of Modelski but not of Chandler, see Reba et al. 2016), but we feel it should be used sparingly (for example, where comparison between different locations/cultures at a particular timepoint is particularly important), and not merely to provide additional data points.
Such interpolation is particularly problematic in the ancient period, where sparse data entail considerable assumptions about the shape and magnitude of growth and decline. This artificially inflated resolution is also found in the Roman War Group Size figures (Figure S2), albeit to a lesser degree, since there is a little more information in the ancient sources. Here it leads to anomalies such as a drop in the size of the Roman military of 184,000 (32%) between 305 and 337 CE (Table S1), without justification in the ancient sources, which, in any case, must often be wildly inaccurate in this period (MacMullen 1980). Even more puzzling is the relationship between their data point for the Roman Empire in 300 CE (described as "Byzantine," also in Table S1) and that for 305 CE (described as "Rome," also in Table S1), which has double the population (Figure 1) and war group size (Figure S2) of that in 300. Perhaps the Byzantine value only counts the eastern half of the empire, but most scholars would not consider the eastern empire a separate polity in this period (e.g. MacMullen 1980; Lee 2007) and there is no historical reason why the eastern half should be treated as a separate polity in 300 but not in 305.
When we come to the Persian Empire, the population size (Figure S1, Table S1) used by Oka et al. is double or triple modern estimates (Wiesehöfer 2009: 76-77; Scheidel 2007: 45) and their war group size is also too large (see comments regarding Herodotus below). The resolution is more appropriate, however, as only one point is included.

Conflict War Group Size and Casualty Data
The Greco-Roman war group size and casualty figures for individual conflicts (85 out of 430 points in Dataset S2) present a greater difficulty than population and overall war group size (Dataset S1, discussed above). Such figures are not considered reliable due to systematic biases and exaggerations and a lack of reliable primary documentary evidence. For example, modern scholars have demonstrated how Greco-Roman writers often minimized their own casualties and troop totals, while exaggerating those of their opponents, in order to magnify Greco-Roman achievements and conform to literary models (Wiesehöfer 2009: 66-70, 76-77; Brunt 1976; Rubincam 1991; Lange 2011). Even if some Greek and Macedonian casualty figures preserved by Greek writers could have a documentary basis (Hammond 1989; Krentz 2004), most seem not to, and there is generally no surviving data from sources written by the opponents of Greco-Roman armies. A good example is Herodotus (Kelly 2003; Rubincam 1991: 181), who is cited by Oka et al. (Dataset S9 #35) and is likely behind their improbably large figure of 2.5 million for the Persian army (Table S1). Even Thucydides (cited by Oka et al. at Dataset S9 #29), conspicuous among ancient historians for his rigorous methodology applied to contemporary events (Krentz 2004: 13-14), offers army and casualty data with a suspicious preponderance of conventional figures such as 200, 300 and 1000 (Rubincam 1991), probably because there was no single tally kept of the dead from Athenian battles (Rubincam 2018).
Modern scholarship has been able to salvage useful information from some (but not all, see below) of such figures. The inclusion of ancient historians in their team would have enabled Oka et al. to locate more up-to-date and in-depth treatments. Many of their sources on Greco-Roman history in Dataset S2 are dated (Dataset S9 #18, first published in 1890, and #24, published 1881) or else are general overviews or encyclopedias (such as Dataset S9 #8, #19, #30 and #51).
To see how Oka et al. treated these problematic data, we conducted a detailed investigation of the three main battles fought between Alexander the Great's Macedonian army and the Persian army. This case study clearly illustrates the systematic biases discussed above: where data survives to be plotted in Figures 2-4, the light blue Persian troop numbers are always much larger than the dark blue Macedonian ones, and the dark red Macedonian casualties are not even visible compared to the pink Persian casualties. The ancient sources (who were largely drawing upon contemporary accounts) give data ranging over two orders of magnitude in each category, and this also applies when C/W values are calculated from their data (Table S2). As mentioned above, the relationship of C/W to W is one of their major findings. For none of these three battles do the data seem trustworthy enough to justify a data point. Trying to rehabilitate such distorted figures requires considerable speculation and the results could well be greatly in error: it is impossible to know.
Arrian provides the only complete set of figures for the Battle of the Granicus River (Figure 2) [...] the entire dataset. There is just too much uncertainty associated with these figures to use them in this way. For the Battle of Issus, there is no single ancient figure for Alexander's war group size, and multiple assumptions need to be made about the changes in Alexander's troop numbers between Granicus and the Battle of Gaugamela (Brunt 1976: lxxi). The results of such a speculative calculation, when combined with the Persian troop totals given in the sources, imply that the Macedonians were outnumbered approximately 10:1. Oka et al. seem to agree with Bosworth (1980: 209) that Persian troop numbers "seem exaggerated" in all sources, as they use a figure for total war group size that is less than half that of the lowest figure for the Persian forces only (Figure 3). Given Persian to Macedonian casualty ratios (Table S2) of 244:1 (Diodorus), 254:1 (Justin) and 604:1 (Curtius), Bosworth (1980: 217) also rightly comments that "both Persian and Macedonian losses are propaganda figures." Since this means that Macedonian casualties should be increased and Persian decreased, it is unclear how the totals preserved in the ancient sources should be modified. Oka et al., presumably following a published scholar, use a figure that is less than half the smallest found in the ancient sources (Figure 3).
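The scale of the distortion in the Issus figures can be made concrete with a little arithmetic. The sketch below is a rough illustration, not a reconstruction of the battle. It assumes that each source's casualty ratio can be paired with the approximate 10:1 troop ratio mentioned above, which is itself the product of a speculative calculation. The names `casualty_ratios`, `TROOP_RATIO` and `relative_loss_rate` are introduced only for this example. Writing C for casualties and W for war group size on the Persian (P) and Macedonian (M) sides, it relies on the identity (C_P/W_P)/(C_M/W_M) = (C_P/C_M)/(W_P/W_M).

```python
# Persian : Macedonian casualty ratios for Issus, as quoted in the text (Table S2).
casualty_ratios = {"Diodorus": 244, "Justin": 254, "Curtius": 604}

# ASSUMPTION: the approximate, speculative 10:1 Persian : Macedonian troop
# ratio from the text is applied to every source alike.
TROOP_RATIO = 10


def relative_loss_rate(casualty_ratio, troop_ratio):
    """Return (C_P / W_P) / (C_M / W_M), i.e. how many times higher the
    Persian per-soldier casualty rate is than the Macedonian one."""
    return casualty_ratio / troop_ratio


implied = {
    source: relative_loss_rate(ratio, TROOP_RATIO)
    for source, ratio in casualty_ratios.items()
}

for source, value in implied.items():
    print(f"{source}: Persian per-soldier loss rate about {value:.1f}x the Macedonian rate")
```

On these assumptions, the sources imply that an individual Persian soldier was roughly 24 to 60 times more likely to become a casualty than an individual Macedonian, which is consistent with Bosworth's description of the losses as propaganda figures.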
For the Battle of Gaugamela (Figure 4), Arrian, normally regarded as a comparatively good source (Bosworth 1980; described as "the best evidence we have for Alexander" by Brunt 1976: xvi-xvii), states that the Persian forces outnumbered the Macedonians by a factor of 22, but nevertheless suffered 3,000 times the Macedonian casualties. Bosworth (1980: 312) describes these casualty figures as "propaganda figures, so remote from reality that no conclusions about the actual losses are possible." The C/W range from the ancient sources here is [...]
[...] figure available for Macedonian War Group size (that of Arrian). It is generally agreed that the Macedonian War Group was smaller than the Persian War Group, so substituting this figure is not thought to have a significant effect on C/W.
[...] problems exist for many of the other historical data points in their datasets. Such issues are to be expected from even excellent scholars when they are operating outside their field. We urge those planning such comparative data projects to include historians specializing in the particular areas from which their data is drawn, and urge historians not to be missing in action in the field of quantitative historical comparison, to prevent well-intentioned misuse of historical data.