Sixteen land surface schemes participating in the Project for the Intercomparison of Land-surface Parametrization Schemes (PILPS) were run off-line to equilibrium using atmospheric forcing data generated by a general circulation model (GCM) for grid points representative of a tropical forest and a mid-latitude grassland. The values for each land surface parameter (roughness length, minimum stomatal resistance, soil depth, etc.) were provided. Results were quality controlled and analyzed, with a focus on the scatter among the models. There were large differences in how the models partitioned available energy between sensible and latent heat. Annually averaged, the simulations for the tropical forest spanned a range of 79 W m-2 for the sensible heat flux and 80 W m-2 for the latent heat flux. For the grassland, the corresponding ranges were 34 W m-2 for the sensible heat flux and 27 W m-2 for the latent heat flux. Similarly large differences were found for simulated runoff and soil moisture, and at the monthly time scale. The annually averaged effective radiative temperature ranged across all the models by 1.4 K for the tropical forest and 2.2 K for the grassland. The simulation of latent and sensible heat fluxes by a standard 'bucket' model was anomalous, although this could be corrected by an additional resistance term. These results imply that current land surface models do not agree on the land surface climate when the atmospheric forcing and surface parameters are prescribed. Because the experiments were run off-line with artificial forcing, the experimental design generally precludes judgements concerning the relative quality of any specific model. Although these results were produced decoupled from a host model, they do cast doubt on the reliability of land surface schemes.
It is therefore a priority to resolve the disparity in the simulations, to understand the reasons behind the scatter, and to determine whether this lack of agreement in decoupled tests is reproduced in coupled experiments.