We present a systematic comparison of tropospheric NO2 from 17 global atmospheric chemistry models with three state-of-the-art retrievals from the Global Ozone Monitoring Experiment (GOME) for the year 2000. The models used constant anthropogenic emissions from IIASA/EDGAR3.2 and monthly emissions from biomass burning based on the 1997–2002 average carbon emissions from the Global Fire Emissions Database (GFED). Model output is analyzed at 10:30 local time, close to the overpass time of the ERS-2 satellite, and collocated with the measurements to account for sampling biases due to incomplete spatiotemporal coverage of the instrument. We assess the importance of different contributions to the sampling bias: correlations on seasonal time scales give rise to a positive bias of 30–50% in the retrieved annual means over regions dominated by emissions from biomass burning. Over the industrial regions of the eastern United States, Europe and eastern China, the retrieved annual means have a negative bias with significant contributions (between –25% and +10% of the NO2 column) resulting from correlations on time scales from a day to a month. We present global maps of modeled and retrieved annual mean NO2 column densities, together with the corresponding ensemble means and standard deviations for models and retrievals. The spatial correlations between the individual models and retrievals are high, typically in the range 0.81–0.93 after smoothing the data to a common resolution. On average the models underestimate the retrievals in industrial regions, especially over eastern China and over the Highveld region of South Africa, and overestimate the retrievals in regions dominated by biomass burning during the dry season. The discrepancy over South America south of the Amazon disappears when we use the GFED emissions specific to the year 2000. The seasonal cycle is analyzed in detail for eight different continental regions.
Over regions dominated by biomass burning, the timing of the seasonal cycle is generally well reproduced by the models. However, over Central Africa south of the Equator the models peak one to two months earlier than the retrievals. We further evaluate a recent proposal to reduce the NOx emission factors for savanna fires by 40% and find that this leads to an improvement of the amplitude of the seasonal cycle over the biomass burning regions of Northern and Central Africa. In these regions the models tend to underestimate the retrievals during the wet season, suggesting that the soil emissions are higher than assumed in the models. In general, the discrepancies between models and retrievals cannot be explained by a priori profile assumptions made in the retrievals, nor by diurnal variations in anthropogenic emissions, which lead to a marginal reduction of the NO2 abundance at 10:30 local time (by 2.5–4.1% over Europe). Overall, there are significant differences among the various models and, in particular, among the three retrievals. The discrepancies among the retrievals (10–50% in the annual mean over polluted regions) indicate that the previously estimated retrieval uncertainties have a large systematic component. Our findings imply that top-down estimates of NOx emissions from satellite retrievals of tropospheric NO2 are strongly dependent on the choice of model and retrieval.
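The seasonal sampling bias mentioned above can be illustrated with a minimal sketch. It is not the procedure used in this study: the function name `sampling_bias`, the 360-day synthetic year, the column values and the sampling pattern are all hypothetical, chosen only to show how an annual mean computed from incompletely sampled days can exceed the true annual mean when observations are more frequent in the season with high columns.

```python
def sampling_bias(daily_columns, sampled_days):
    """Relative bias of the mean over sampled days with respect to
    the mean over all days (hypothetical helper for illustration)."""
    full_mean = sum(daily_columns) / len(daily_columns)
    sampled = [daily_columns[d] for d in sampled_days]
    sampled_mean = sum(sampled) / len(sampled)
    return (sampled_mean - full_mean) / full_mean


# Synthetic year of 360 days (assumption): the NO2 column is 2.0
# (arbitrary units) during a 90-day burning season and 1.0 otherwise.
columns = [2.0 if 180 <= day < 270 else 1.0 for day in range(360)]

# Assumed sampling pattern: every day is observed during the burning
# season, but only every third day during the rest of the year.
observed = [day for day in range(360) if 180 <= day < 270 or day % 3 == 0]

bias = sampling_bias(columns, observed)          # seasonally correlated sampling
no_bias = sampling_bias(columns, range(360))     # complete sampling
print(f"relative sampling bias: {bias:+.0%}")
```

In this synthetic case the sampled annual mean is 1.5 against a true annual mean of 1.25, a relative bias of +20%. Collocating model output with the same set of observed days removes this effect from a model-to-retrieval comparison, because both annual means are then built from identical samples.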