The development of models to capture large-scale dynamics in human history is one of the core contributions of cliodynamics. Most often, these models are assessed by how well they predict some aggregated, macro-scale measure, which is compared against manually curated historical data. In this report, we consider the model from Turchin et al. (2013), which is evaluated on its prediction of “imperial density”: the relative frequency with which a geographical area belonged to large-scale polities over a certain time window. We implement the model and release both code and data for reproducibility. We then assess its behavior against three historical datasets, comparing the relative size of simulated polities with that of historical ones, the spatial correlation of simulated imperial density with historical population density, and the spatial correlation of simulated conflict with historical conflict. At the global level, we show good agreement with population density (R² < 0.75) and some agreement with historical conflict in Europe (R² < 0.42). However, the model fails to reproduce the historical shape of individual polities. Finally, we modify the model to behave greedily by having polities preferentially attack weaker neighbors. Results degrade significantly, suggesting that random attacks are a key trait of the original model. We conclude by proposing a way forward: matching the probabilistic imperial strength from simulations to networked communities inferred from real settlement data.
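
To make the definition of imperial density concrete, the following minimal sketch computes it on a toy grid and scores it against a reference map. It is an illustration under stated assumptions, not the released implementation: the function names `imperial_density` and `r_squared`, the 0/1 snapshot encoding, and the use of the squared Pearson correlation as the R² measure are all hypothetical choices made here.

```python
def imperial_density(snapshots):
    """Relative frequency with which each cell belonged to a large polity.

    snapshots: list of T grids (lists of rows) holding 1 where the cell is
    part of a large-scale polity at that time step and 0 otherwise. The size
    threshold that defines "large" is assumed to be applied upstream.
    """
    n_steps = len(snapshots)
    n_rows, n_cols = len(snapshots[0]), len(snapshots[0][0])
    return [
        [sum(s[i][j] for s in snapshots) / n_steps for j in range(n_cols)]
        for i in range(n_rows)
    ]


def r_squared(map_a, map_b):
    """Squared Pearson correlation between two grids of equal shape."""
    xs = [v for row in map_a for v in row]
    ys = [v for row in map_b for v in row]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov * cov / (var_x * var_y)


# Toy data (invented for illustration): 4 time steps on a 2x2 grid.
snapshots = [
    [[1, 0], [0, 0]],
    [[1, 1], [0, 0]],
    [[1, 1], [1, 0]],
    [[1, 1], [1, 0]],
]
density = imperial_density(snapshots)

# Hypothetical reference map, e.g. a population-density proxy per cell.
reference = [[4, 3], [2, 0]]
score = r_squared(density, reference)
```

In this toy case the reference map is an exact linear rescaling of the simulated density, so the score sits at the upper bound of 1; real comparisons would first have to align both maps to the same spatial grid.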