Stocks of soil organic carbon represent a large component of the carbon cycle that may participate in climate change feedbacks, particularly on decadal and centennial timescales. For Earth system models (ESMs), the ability to accurately represent the global distribution of existing soil carbon stocks is a prerequisite for accurately predicting future carbon–climate feedbacks. We compared soil carbon simulations from 11 model centers to empirical data from the Harmonized World Soil Database (HWSD) and the Northern Circumpolar Soil Carbon Database (NCSCD). Model estimates of global soil carbon stocks ranged from 510 to 3040 Pg C, compared to an estimate of 1260 Pg C (with a 95% confidence interval of 890–1660 Pg C) from the HWSD. Model simulations for the high northern latitudes fell between 60 and 820 Pg C, compared to 500 Pg C (with a 95% confidence interval of 380–620 Pg C) for the NCSCD and 290 Pg C for the HWSD. Global soil carbon varied 5.9-fold across models in response to a 2.6-fold variation in global net primary productivity (NPP) and a 3.6-fold variation in global soil carbon turnover times. Model–data agreement was moderate at the biome level (R² values ranged from 0.38 to 0.97 with a mean of 0.75); however, the spatial distribution of soil carbon simulated by the ESMs at the 1° scale was not well correlated with the HWSD (Pearson correlation coefficients less than 0.4 and root mean square errors from 9.4 to 20.8 kg C m⁻²). In northern latitudes where the two data sets overlapped, agreement between the HWSD and the NCSCD was poor (Pearson correlation coefficient 0.33), indicating uncertainty in empirical estimates of soil carbon. We found that a reduced-complexity model dependent on NPP and soil temperature explained much of the 1° spatial variation in soil carbon within most ESMs (R² values between 0.62 and 0.93 for 9 of 11 model centers).
However, the same reduced-complexity model explained only 10% of the spatial variation in HWSD soil carbon when driven by observations of NPP and temperature, implying that other drivers or processes may be more important in explaining observed soil carbon distributions. The reduced-complexity model also showed that differences in simulated soil carbon across ESMs were driven by differences in simulated NPP and the parameterization of soil heterotrophic respiration (inter-model R² = 0.93), not by structural differences between the models. Overall, our results suggest that despite fair global-scale agreement with observational data and moderate agreement at the biome scale, most ESMs cannot reproduce grid-scale variation in soil carbon and may be missing key processes. Future work should focus on improving the simulation of driving variables for soil carbon stocks and modifying model structures to include additional processes.
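To make the idea of a reduced-complexity model driven by NPP and soil temperature concrete, the following is a minimal sketch only. The text above does not state the functional form, so the sketch assumes a common one: a single soil pool at steady state whose first-order decomposition rate scales with temperature through a Q10 factor. The function names, the reference temperature of 15 °C, and all parameter values (`k_ref`, `q10`, and the grid-cell inputs) are hypothetical and chosen for illustration, not taken from any ESM or data set.

```python
def decomposition_rate(temp_c, k_ref, q10, t_ref=15.0):
    """First-order decomposition rate (yr^-1) at soil temperature temp_c (deg C).

    Assumed Q10 form: the rate equals k_ref at t_ref and is multiplied
    by q10 for every 10 deg C of warming.
    """
    return k_ref * q10 ** ((temp_c - t_ref) / 10.0)


def steady_state_soil_carbon(npp, temp_c, k_ref, q10, t_ref=15.0):
    """Steady-state soil carbon (kg C m^-2) for a single-pool model.

    At steady state, inputs (NPP, kg C m^-2 yr^-1) balance losses (k * C),
    so C = NPP / k.
    """
    return npp / decomposition_rate(temp_c, k_ref, q10, t_ref)


def turnover_time(soil_carbon, npp):
    """Turnover time (yr), defined as stock divided by input flux."""
    return soil_carbon / npp


if __name__ == "__main__":
    # Hypothetical grid cells: (NPP in kg C m^-2 yr^-1, soil temperature in deg C).
    cells = [(0.3, -5.0), (0.6, 10.0), (1.0, 25.0)]
    for npp, temp in cells:
        carbon = steady_state_soil_carbon(npp, temp, k_ref=0.05, q10=2.0)
        tau = turnover_time(carbon, npp)
        print(f"NPP={npp:.1f}  T={temp:5.1f}  C={carbon:6.2f} kg C m^-2  tau={tau:6.1f} yr")
```

Under this assumed form, soil carbon is the product of NPP and turnover time, which is why the abstract reports variation in both quantities alongside the variation in global soil carbon. Fitting `k_ref` and `q10` separately to each model's 1° output would be one way to obtain per-model explained variance of the kind quoted above, although the fitting procedure itself is not described in this text.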