Upscaling of field-scale soil moisture measurements using distributed land surface modeling

Accurate coarse-scale soil moisture information is required for robust validation of current- and next-generation soil moisture products derived from spaceborne radiometers. Due to large amounts of land surface and rainfall heterogeneity, such information is difficult to obtain from existing ground-based networks of soil moisture sensors. Using ground-based field data collected during the Soil Moisture Experiment in 2002 (SMEX02), the potential for using distributed modeling predictions of the land surface as an upscaling tool for field-scale soil moisture observations is examined. Results demonstrate that distributed models are capable of accurately capturing a significant level of field-scale soil moisture heterogeneity observed during SMEX02. A simple soil moisture upscaling strategy based on the merger of ground-based observations with modeling predictions is developed and shown to be more robust during SMEX02 than upscaling approaches that utilize either field-scale ground observations or model predictions in isolation.


Introduction
No passive spaceborne soil moisture sensor in the foreseeable future will have a ground spatial resolution finer than 30 km. Current soil moisture observations from the advanced microwave scanning radiometer (AMSR) sensor aboard the NASA AQUA satellite, for instance, are derived from radiometer observations with a -3 dB resolution of ~60 km. Given the magnitude of heterogeneity typically observed in surface soil moisture fields [1,36,15,16], ambiguities associated with upscaling point-scale observations to spaceborne radiometer footprint scales have emerged as a major challenge in attempts to validate remote sensing soil moisture retrievals [18,10].
Even extensive soil moisture networks like the Oklahoma Mesonet, the Illinois Water Survey, and the Southern Great Plains ARM-CART system have an average site spacing greater than 30 km and will provide, at best, a single observation within a given footprint. Networks with denser soil moisture sampling locations typically cover only a fraction of a radiometer footprint and will be vulnerable to extrapolation error in the presence of heterogeneous rainfall. Some of these difficulties can be mitigated through optimized interpolation and site selection approaches. Block-kriging techniques, for instance, allow for the optimal interpolation of point-scale measurements based on a spatial field's auto-correlation structure. This possibility has spawned interest in accurately measuring and/or generalizing the spatial structure of soil moisture fields under various hydrologic conditions [12,40,38,25]. There is also growing evidence that surface soil moisture fields, due to the static influence of soil, vegetation, and topography, exhibit temporally persistent spatial patterns at local scales (<5 km) [37,18,26,22]. Such persistence, or time stability, can be exploited by selecting measurement sites which consistently reflect soil moisture conditions over a wider geographic area. However, the upscaling skill of time-stability methodologies is reduced by the impact of spatially heterogeneous rainfall [10], which may limit their effectiveness when applied within coarser footprint scales (>10 km).
0309-1708/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.advwatres.2004.10.004
In addition to ground-based networks, an alternative source of surface soil moisture is distributed land surface modeling. Such models can synthesize spatially distributed rainfall, land use, soil, and topographic maps to produce surface soil moisture predictions over large spatial areas. Because they are distributed in nature, these predictions do not suffer from the same spatial support and sampling density inadequacies as ground-based networks. However, the use of unconstrained model output as a source of validation data is likely to be problematic. Reasons for skepticism include well-known errors in the spatial patterns of observed rainfall [34] and soil texture fields [44] typically used to force models, difficulties surrounding proper model calibration and parameter identification [4,20,21], and the inability of current observing systems to measure some key model inputs (e.g. wind speed and relative humidity) at fine spatial scales (<10 km). In addition, absolute levels of modeled soil moisture have been shown to be highly model dependent [24,13]. This implies that model representation of relative space/time patterns may be more meaningful than predictions of absolute soil moisture magnitudes. However, to date, relatively few validation studies have focused explicitly on evaluating spatially distributed predictions from land surface models [19].
A third possibility is to combine distributed modeling with local soil moisture observations. A range of possible strategies exists, including data assimilation and model calibration strategies. The basis of each is the presumption that model output, at the very least, contains basic spatial information about the relative relationship between soil moisture at a given measurement location and spatially averaged soil moisture within some larger regional area. If this is true, the relative spatial patterns predicted by the model can be integrated with sparse ground-based observations to improve estimates of footprint-scale soil moisture averages. Such integrated estimates will be more accurate than unconstrained model predictions if the relative patterns of soil moisture predicted by models prove more robust to modeling uncertainty than predictions of absolute soil moisture.
Intensive soil moisture sampling conducted during the Soil Moisture Experiment in 2002 (SMEX02) between June 25 and July 12, 2002 in central Iowa provides a unique opportunity to test aspects of this hypothesis and examine the potential role of land surface modeling in upscaling local soil moisture observations. Specifically, this analysis will evaluate the degree to which a land surface hydrology model can accurately reproduce surface (0-6 cm) soil moisture heterogeneity and spatial patterns observed in extensive ground-based soil moisture observations made during SMEX02. Basic upscaling strategies that use TOPLATS simulations to upscale local-scale observations to footprint-scale (>30 km) soil moisture means will be evaluated based on their potential as validation strategies for coarse-scale spaceborne soil moisture retrievals. As a first step, this analysis will focus primarily on upscaling field-scale (800 m) soil moisture observations. However, prospects for upscaling point-scale observations will also be discussed.

Land surface modeling
Land surface modeling was based on TOPMODEL-based Land Atmosphere Transfer Scheme (TOPLATS) [14,28] predictions over the 6378 km$^2$ regional domain displayed in Fig. 1. A model grid size of 90 m, requiring approximately 800,000 individual pixels, was used for all simulations. Several modifications were made to the model relative to the baseline version described in [28]. Most critically, the two-layer soil water balance was expanded to four layers. Calculations of diffusive and gravity drainage fluxes retain the same numerical form. However, these fluxes are now calculated for each of four soil layers and simultaneously balanced using a semi-implicit numerical scheme. The modification requires new user inputs of depths for the top three soil layers (the fourth soil layer is bounded at the bottom by a dynamic water table depth) and the specification of areal rooting fractions in all four layers. Results for the new four-layer version of TOPLATS are also reported in [11].
A second modification was made to allow for the calculation of separate soil and canopy contributions to evapotranspiration within TOPLATS grid elements. Previous versions of TOPLATS required that grid elements be characterized as either solely bare soil or solely vegetated, with direct soil evaporation neglected in vegetated pixels. In reality, soil evaporation plays a significant role in determining surface soil moisture under sparse canopies and between crop rows. To capture this, total grid cell evapotranspiration ($E_T$) was calculated as:

$$E_T = f_v T + (1 - f_v) E \qquad (1)$$

where $f_v$ is the vegetated fraction of the grid cell, $T$ is the transpiration calculated from the vegetated fraction of the grid cell, and $E$ is direct soil evaporation from the bare soil portion of the grid cell. Vegetated fraction is calculated from normalized difference vegetation index (NDVI) observations and the approach of [7] which predicts:

$$f_v = 1 - \left( \frac{NDVI_{max} - NDVI}{NDVI_{max} - NDVI_{min}} \right)^{p} \qquad (2)$$

Variables $NDVI_{min}$ and $NDVI_{max}$ are the minimum and maximum NDVI values observed within a scene. Within-scene variability is assumed to be sufficient such that these values accurately represent NDVI levels for bare soil ($f_v = 0$) and full vegetative cover ($f_v = 1$) conditions. The parameter $p$ is formally defined as a function of leaf angle distribution and solar radiation extinction within the canopy; however, it is often calibrated within a typical range of 0.5 to 0.7. Bare soil evaporation ($E$) is calculated using the soil resistance approach described in [28]. Transpiration ($T$) is calculated from $T_{\max,i}$, $q_i$, and $T_p$, where $T_{\max,i}$ is the maximum rate of transpiration sustainable given the moisture status of soil layer $i$, $q_i$ is the relative fraction of root area within layer $i$, and $T_p$ is the potential transpiration [41]. Based in part on observed leaf area index (LAI) magnitudes, $T_p$ is calculated following the Jarvis-type approach presented in [28].
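The evapotranspiration partitioning described above can be illustrated with a short sketch. It assumes that (1) is the area-weighted sum of canopy and bare-soil fluxes and that (2) takes the scaled-NDVI power-law form $f_v = 1 - [(NDVI_{max} - NDVI)/(NDVI_{max} - NDVI_{min})]^p$, which reproduces the stated limits ($f_v = 0$ at $NDVI_{min}$, $f_v = 1$ at $NDVI_{max}$). The function names and the clamping of out-of-range NDVI values are illustrative choices, not part of TOPLATS.

```python
def vegetated_fraction(ndvi, ndvi_min, ndvi_max, p):
    """Fractional vegetation cover from NDVI (assumed scaled-NDVI power-law form)."""
    # Clamp NDVI to the scene range so the result stays within [0, 1]
    # (an illustrative safeguard, not described in the text).
    ndvi = min(max(ndvi, ndvi_min), ndvi_max)
    scaled = (ndvi_max - ndvi) / (ndvi_max - ndvi_min)
    return 1.0 - scaled ** p


def grid_cell_evapotranspiration(f_v, transpiration, soil_evaporation):
    """Area-weighted sum of canopy transpiration and bare-soil evaporation."""
    return f_v * transpiration + (1.0 - f_v) * soil_evaporation


if __name__ == "__main__":
    # NDVI of 0.6 is a made-up pixel value; the scene limits and exponent
    # are the calibrated values quoted later in the text.
    f_v = vegetated_fraction(0.6, ndvi_min=0.037, ndvi_max=0.93, p=0.606)
    print(round(f_v, 3))
```

Because $p < 1$, the assumed form makes $f_v$ rise slowly at low NDVI and steeply as NDVI approaches $NDVI_{max}$.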
Following [14], $T_{\max,i}$ is based on the approach of [42]:

$$T_{\max,i} = \frac{\psi(\theta_i) - \psi_c}{r_s + r_p} \qquad (4)$$

where $\psi(\theta_i)$ is the soil water matric potential at soil moisture $\theta_i$, $\theta_i$ is the soil moisture in layer $i$, $\psi_c$ is the critical soil moisture potential at which plant wilting begins, $r_s$ is the soil resistivity to water flow into the roots, and $r_p$ is the internal plant resistivity to water flow. Soil resistivity ($r_s$) is typically modeled as:

$$r_s = \frac{\alpha}{K} \qquad (5)$$

where $\alpha$ is a root geometry parameter with dimensions of length and $K$ is the hydraulic conductivity of the soil. Bare soil evaporation was modeled as described in [28] except, following [32], an additional resistance term was added to the aerodynamic resistance in the TOPLATS bare soil evaporation ($E$) calculation (Eq. (15) in [28]) to parameterize aerodynamic resistance immediately above the bare soil surface. As in [28], the expression of [27] was used to model bare soil resistance to vapor flow ($r_{soil}$):

$$r_{soil} = a \exp\left( -b \, \frac{\theta_1}{\theta_{fc}} \right) \qquad (6)$$

where $\theta_{fc}$ is the soil field capacity (at -0.1 bars), $\theta_1$ is surface (6 cm) soil moisture, and $a$ and $b$ are calibrated parameters with values recommended in [27] of $3.8113 \times 10^4$ s m$^{-1}$ and 13.515, respectively.

Model forcing data
Model forcing data (e.g. surface air temperature, relative humidity, wind speed, and air pressure) were derived from regional meteorological stations shown in Fig. 2. Interpolation of station observations was performed using $r^{-2}$ weighting, where $r$ is the distance between a given station and a given pixel. Within the Walnut Creek watershed sampling area, spatial precipitation maps were calculated using $r^{-2}$ spatial interpolation of hourly observations from the Walnut Creek watershed rain gauge network (see Fig. 2). Rain gauge observations in the watershed are unusually dense and likely lead to higher rainfall accuracies than is typically achievable in operational rainfall products within the United States. In areas inside of the modeling domain but outside of the watershed sampling area (see Figs. 1 and 2), 4-km Stage IV precipitation observations, derived from merging ground-based weather radar observations and relatively sparse rain gauge measurements, were used. A land cover classification for the region was derived from 30-m Thematic Mapper (TM) imagery acquired on May 14, July 1, and July 17, 2002. The classification was then aggregated from 30 to 90 m based on the most common land cover within each 90-m pixel. Road surfaces were neglected in this aggregation and no attempt was made to model them. However, due to artificial widening of roadways in the original 30-m classification, some 90-m pixels contained only surfaces classified as road. In these cases, a (non-road) land cover was randomly selected from one of the four cardinal directions and used to replace the road classification. High-resolution (30-m) NDVI maps of the area were calculated using cloud-free TM overpasses on June 23 and July 1, 2002. These maps were aggregated to 90 m and georeferenced to the TOPLATS modeling grid.
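The $r^{-2}$ station interpolation can be sketched as follows. This is a minimal illustration of inverse-distance-squared weighting, not the TOPLATS preprocessing code; the function name, the planar coordinate system, and the handling of a pixel that coincides with a station are assumptions.

```python
import math


def inverse_distance_interpolate(stations, x, y, power=2.0):
    """Interpolate a station variable to pixel (x, y) using r**(-power) weights.

    stations: iterable of (station_x, station_y, value) tuples in the same
    planar coordinates as (x, y).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        r = math.hypot(x - sx, y - sy)
        if r == 0.0:
            # The pixel coincides with a station: return its value directly
            # to avoid dividing by zero (an illustrative choice).
            return value
        weight = r ** (-power)
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Two hypothetical gauges 3 km apart reporting 10 and 20 mm of rain.
    gauges = [(0.0, 0.0, 10.0), (3.0, 0.0, 20.0)]
    print(inverse_distance_interpolate(gauges, 1.0, 0.0))
```

With weights proportional to $r^{-2}$, a gauge at half the distance carries four times the weight, so the interpolated field is dominated by the nearest stations.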
Soil texture maps of the domain were acquired from the Iowa Soil Properties and Interpretations Database created by Iowa State University in cooperation with the USDA and Iowa Department of Agriculture and Land Stewardship. A topographic index map for the region [3] was derived from a 90-m USGS digital elevation map (DEM). Based on this DEM, the regional domain was subdivided into 113 separate watersheds with an average size of about 56 km$^2$. Separate distributed TOPLATS simulations were run on a 90-m grid within each watershed. Soil moisture imagery was then reconstructed by merging predictions from each watershed.

Model parameters
Like most land surface models, TOPLATS requires the specification of a large number of model parameters to run in a distributed manner. Fortunately, intensive ground-based sampling of vegetation, soil, and micrometeorology during SMEX02 provides much better guidance for parameter selection than would normally be available. Corn and soybean land cover constitute 47% and 38% of the model domain, respectively. Vegetation characteristics within these classes were very dynamic during the course of SMEX02. To capture this, corn and soybean leaf area index (LAI), plant height ($h$), and effective rooting depth ($Z_{eff}$) parameters used for TOPLATS simulations were varied on a weekly basis according to Fig. 3. LAI and $h$ values were taken from vegetation sampling performed within the Walnut Creek Watershed during the course of the experiment. Values of $Z_{eff}$, defined as the depth above which 80% of plant roots are found, were based on consideration of corn and soybean growth stage during the experiment and typical seasonal root development for both crops. Relative fractions of rooting area in each of the model's four vertical layers were calculated by assuming an exponential decay of root area density with depth. Following [17], the root spacing parameter in (5) was calculated using the empirical expression $0.0013/Z_{eff}$, where both the parameter and $Z_{eff}$ are in meters. Roughness lengths for momentum and heat transfer were assumed to be $h/10$ and $h/100$, respectively, and zero plane displacement height ($D$) was set to $2h/3$ [23,2]. Fractional vegetation covers ($f_v$) were derived using (2) and TM NDVI observations. Since typical internal plant resistances ($r_p$) for crops vary between $5.0 \times 10^8$ and $1.0 \times 10^9$ s [41], an averaged value of $7.5 \times 10^8$ s was used for corn and soybean.
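A minimal sketch of how layer rooting fractions could follow from the exponential-decay assumption is given below. It assumes a cumulative root fraction of $1 - e^{-kz}$ with $k$ chosen so that 80% of roots lie above $Z_{eff}$, and assigns all remaining roots to the fourth layer. The layer depths in the example are hypothetical; the text does not report the depths actually used.

```python
import math


def root_fractions(upper_layer_bottoms, z_eff, fraction_above_z_eff=0.8):
    """Root-area fraction per layer for exponentially decaying root density.

    upper_layer_bottoms: increasing bottom depths (m) of all but the deepest
    layer; the deepest layer receives whatever root fraction remains.
    """
    # Decay constant such that the cumulative fraction at z_eff equals 0.8.
    k = -math.log(1.0 - fraction_above_z_eff) / z_eff
    fractions = []
    cumulative_above = 0.0
    for bottom in upper_layer_bottoms:
        cumulative = 1.0 - math.exp(-k * bottom)
        fractions.append(cumulative - cumulative_above)
        cumulative_above = cumulative
    fractions.append(1.0 - cumulative_above)
    return fractions


def canopy_aerodynamic_parameters(h):
    """Roughness lengths and displacement height from plant height h (m)."""
    return {
        "roughness_momentum": h / 10.0,
        "roughness_heat": h / 100.0,
        "displacement_height": 2.0 * h / 3.0,
    }


def root_spacing_parameter(z_eff):
    """Empirical root spacing parameter (m) for z_eff in meters."""
    return 0.0013 / z_eff


if __name__ == "__main__":
    # Hypothetical layer bottoms at 6, 20, and 50 cm with Z_eff = 0.5 m.
    print([round(q, 3) for q in root_fractions([0.06, 0.20, 0.50], 0.50)])
```

By construction the fractions sum to one, and the layers above $Z_{eff}$ together hold 80% of the root area.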
Significant non-crop land cover types in the model domain include grass (9% of total area) and tree (4% of total area) cover types. Vegetation cover within non-crop areas was parameterized as being static in time and having complete canopy coverage ($f_v = 1$). Grass areas were parameterized the same as crop areas except that $h$ was set to 0.50 m and $Z_{eff}$ to 0.20 m. Tree areas were modeled with an $h$ of 2.5 m and a $Z_{eff}$ of 1.5 m.
To reflect physiological differences relative to grasses and crops, the $r_p$ of trees was raised to $1.5 \times 10^9$ s. Smaller amounts of open water and urban land cover types, corresponding to ~2% of the regional domain, were modeled as unvegetated and impermeable. The albedo and emissivity of all modeled land surfaces were set to 0.20 and 0.96, respectively. Potential changes in total surface albedo due to canopy coverage variations were neglected since little site-specific data was available and the albedo range for corn and soybean cover is nearly identical to typically cited values for loamy soils (see albedo values reported in [5,35,30]). All soil parameters were derived from the soil textural classification map and texture-based lookup tables [8]. Based on comparison with ground-based LAI measurements during SMEX02, optimal values for $NDVI_{max}$, $NDVI_{min}$, and $p$ in (2) were found to be 0.93, 0.037, and 0.606, respectively (M. Anderson, personal communication).
Human modification to the landscape has substantially altered the sub-surface hydrology of the SMEX02 region and requires careful consideration when parameterizing sub-surface flow within TOPLATS. Using TOPMODEL concepts [3], TOPLATS predicts a local water table depth $z$ to be:

$$z = \bar{z} - \frac{1}{f} \left( STI - \overline{STI} \right) \qquad (7)$$

where $STI$ is the local soil topographic index of [33] and the overbars signify averaging within a watershed. The parameter $f$ controls the sensitivity of variations in $z$ to topographic patterns. Areas of high $STI$ are generally prone to surface saturation. However, within the SMEX02 site, widespread use of tile drains in high $STI$ areas prevents them from becoming saturated. Within the TOPMODEL framework, low water tables can be maintained in high $STI$ areas by specifying a high $f$ value in (7). Since drainage tiles typically discharge directly into a surface drainage network, it is also necessary to elevate model predictions of baseflow to account for enhancements in lateral flow through the drainage system. TOPLATS predicts baseflow $Q$ to be:

$$Q = Q_0 \exp\left( -f \bar{z} \right) \qquad (8)$$

where $Q_0$ is defined as the rate of baseflow at complete saturation but is often treated as a calibrated parameter. Based on these considerations, an $f$ value of 9 and a $Q_0$ value of 0.012 m s$^{-1}$ were used. Both values are larger than values typically assigned to agricultural basins. However, they produce a reasonable baseflow recession and surface runoff response to rainfall during the SMEX02 period (see Section 5).
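The role of $f$ can be seen in a short sketch that assumes the standard TOPMODEL relations, $z = \bar{z} - (STI - \overline{STI})/f$ for local water table depth and $Q = Q_0 \exp(-f\bar{z})$ for baseflow. The function names and the example index values are hypothetical and serve only to show the sensitivity.

```python
import math


def local_water_table_depth(sti, mean_sti, mean_depth, f):
    """Local water table depth (m) from the watershed-mean depth (assumed TOPMODEL form)."""
    return mean_depth - (sti - mean_sti) / f


def baseflow(q0, f, mean_depth):
    """Baseflow as an exponential function of mean water table depth (assumed TOPMODEL form)."""
    return q0 * math.exp(-f * mean_depth)


if __name__ == "__main__":
    # Hypothetical wet location: STI of 12 in a watershed with mean STI of 8
    # and a mean water table depth of 1.5 m.
    for f in (1.0, 9.0):
        print(f, round(local_water_table_depth(12.0, 8.0, 1.5, f), 3))
```

For the hypothetical high-$STI$ pixel, a small $f$ places the water table above the surface (negative depth, i.e. saturation), while $f = 9$ keeps it roughly a meter below ground, which is how a large $f$ mimics the effect of tile drainage.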

Ground-based SMEX02 data
Watershed soil moisture sampling was performed at 31 field-scale sites in and around the 47 km$^2$ Walnut Creek watershed south of Ames, Iowa (Fig. 2) during SMEX02. Sampling of these sites consisted of theta probe measurements at 14 separate sub-field locations on a stratified grid and was designed to estimate field-scale (800 m) surface (0-6 cm) soil moisture means. An arbitrary field-scale of 800 m was chosen by SMEX02 campaign organizers based both on consideration of typical management unit sizes (i.e. patches of homogeneous vegetation) in the area and the ground resolution of the airborne radiometers flown during SMEX02. Likewise, a 6-cm sampling depth was employed to approximate the measurement depth for remote L-band radiometers. To filter the impact of micro-topography, point-scale theta probe measurements at each of the 14 sub-field locations were actually based on the mean of three individual measurements across a single crop row. Watershed sampling was completed on 11 separate days during the SMEX02 period (June 25, 26, 27, and July 1, 5, 6, 7, 8, 9, 11, 12). An alternative methodology, regional soil moisture sampling, was based on measurements at 47 sites on a stratified grid covering a much larger domain within central Iowa (Figs. 1 and 2). As in watershed sampling, surface (0-6 cm) soil moisture at each regional site was estimated from the mean of three theta probe measurements across a single crop row. Regional sampling was designed to accurately estimate soil moisture statistics within an area roughly equivalent to two spaceborne AMSR-E footprints and completed on 16 separate days during SMEX02 (June 25, 26, 27, 29, 30, and July 1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12). Gravimetric soil moisture observations were taken at some regional and watershed sites and used to calibrate theta probe soil moisture observations. However, unless otherwise noted, all soil moisture results presented here are based on theta probe observations.
Eddy correlation flux tower observations of land surface energy fluxes are also available at sites noted in Fig. 2. In addition, 12 of the watershed sites were instrumented with fixed soil moisture sensors (Stevens-Vitel Hydra probes) at a depth of 5 cm.

Upscaling strategies
Based on 47 separate sampling points (Fig. 2), regional soil moisture sampling was designed to provide a single soil moisture mean at a given sampling time $i$ for the entire domain in Fig. 1. These estimates can be used to evaluate various strategies for upscaling surface soil moisture. Domain-scale estimates made at sampling time $i$ are referred to as $\theta_i$. Watershed sampling retrieved spatial averages of soil moisture at finer field scales (~800 m). Field-scale observations made at site $j$ and sampling time $i$ are referred to as $\theta_{i,j}$. Model predictions are available on a 90-m grid overlaying the entire domain. The domain-scale mean of these predictions at sampling time $i$ is referred to as $\theta'_i$. Field-scale model predictions can be extracted by averaging 90-m grid cells that fall within the watershed fields (squares) shown in Fig. 2. Model predictions of spatially-averaged surface soil moisture conditions within site $j$ for sampling time $i$ will be referred to as $\theta'_{i,j}$. The central purpose of this analysis is the estimation of $\theta_i$ based on model-derived predictions, $\theta'_i$, and a limited number of site observations, $\theta_{i,j}$. The most direct upscaling strategy is the weighted averaging of observations taken at time $i$:

$$\theta_i \approx \sum_{j=1}^{n_{sites}} w_j \, \theta_{i,j} \qquad (9)$$

where $n_{sites}$ is the number of sites where observations are available. Values for $w_j$ sum to unity and can be derived in a number of different ways. Simple spatial averaging dictates $w_j = n_{sites}^{-1}$. More sophisticated approaches based on block-kriging or time stability analysis retrieve $w_j$ values based on the sampled auto-correlation of the soil moisture field or knowledge of time-invariant patterns. Ideally, observation sites will be distributed widely enough such that regional heterogeneity in vegetation, soil, and rainfall is adequately sampled. If not, sampling errors may be large even if sophisticated interpolation techniques are employed.
Spatially-averaged model predictions can also be used to estimate $\theta_i$:

$$\theta_i \approx \theta'_i \qquad (10)$$

where $\theta'_i$ is the mean of all 90-m model predictions within the domain. However, as noted in Section 1, model predictions of absolute soil moisture levels are sensitive to parameterization ambiguities and prone to bias. Rather than use observations or model predictions in isolation, a third upscaling approach is some combination of model predictions and field-scale observations. A simple form of this approach is to use comparisons between instantaneous model predictions, $\theta'_{i,j}$, and observed, $\theta_{i,j}$, soil moisture for a small number of sites to estimate (and eliminate) bias in domain-scale soil moisture predictions, $\theta'_i$:

$$\theta_i \approx \theta'_i + \sum_{j=1}^{n_{sites}} w_j \left( \theta_{i,j} - \theta'_{i,j} \right) \qquad (11)$$

Here ground-based observations are sampled and compared to local model predictions to estimate the domain-scale model bias, $\theta_i - \theta'_i$, and not, as in (9), the mean of the actual soil moisture field, $\theta_i$. A possible rationale for using (11) instead of (9) is that the spatially distributed model bias field $\theta_i - \theta'_i$ may exhibit less large-scale variability than the actual underlying soil moisture field. If ground-based observations of soil moisture are restricted to sparse locations, this will lead to reductions in sampling errors for domain-scale estimates. Likewise, a possible rationale for using (11) instead of the model-only strategy in (10) is that model predictions of relative soil moisture patterns may prove more robust to parameterization uncertainties than predictions of absolute soil moisture levels.
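The three strategies can be compared side by side in a short sketch. It assumes that (9) is a weighted mean of site observations, (10) is the domain mean of model predictions, and (11) adds the weighted mean of site-level observation-minus-model differences to that domain mean. The function names, the default of equal weights, and the numbers in the example are illustrative only.

```python
def equal_weights(n_sites):
    """Weights for simple spatial averaging (each site weighted 1 / n_sites)."""
    return [1.0 / n_sites] * n_sites


def upscale_observations_only(site_obs, weights=None):
    """Weighted average of field-scale observations."""
    weights = weights or equal_weights(len(site_obs))
    return sum(w * obs for w, obs in zip(weights, site_obs))


def upscale_model_only(model_domain_mean):
    """Domain-mean model prediction used directly as the estimate."""
    return model_domain_mean


def upscale_combined(site_obs, site_model, model_domain_mean, weights=None):
    """Domain-mean model prediction corrected by the site-estimated model bias."""
    weights = weights or equal_weights(len(site_obs))
    bias = sum(w * (obs - mod)
               for w, obs, mod in zip(weights, site_obs, site_model))
    return model_domain_mean + bias


if __name__ == "__main__":
    # Made-up volumetric soil moisture values for two sampled fields.
    obs = [0.20, 0.30]
    mod = [0.25, 0.35]
    print(upscale_observations_only(obs),
          upscale_model_only(0.28),
          upscale_combined(obs, mod, 0.28))
```

In this toy case the model is uniformly 0.05 too wet at the sampled fields, so the combined estimate lowers the model domain mean by that amount rather than relying on the sampled fields to represent the whole domain.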

Results
Modeling results are based on 90-m TOPLATS simulations (described in Section 2) run over the entire modeling domain shown in Fig. 1 (Fig. 1) drained by these three streams. No attempt was made to route TOPLATS predictions or correct streamflow observations for human diversion, impoundment, or hydrologic routing. Nevertheless, TOPLATS is able to reproduce the baseflow recession between June 15 and July 4 and the volume of runoff in response to rainfall events between July 4 and 13 with reasonable precision. Evapotranspiration observations in Fig. 5 are from flux tower sites shown in Fig. 2. TOPLATS does a good job capturing mean evapotranspiration rates for the period but demonstrates less skill in capturing day-to-day variability. Fig. 6 shows comparisons between regional soil moisture, estimated from averaging all 47 regional-sampling observations at sites shown in Fig. 2 ($\theta_i$), and comparable TOPLATS results estimated from averaging of all 90-m pixels in the same regional domain ($\theta'_i$). Plotted error bars represent one-standard-deviation ($1\sigma$) variability in $\theta_i$ arising from random sampling and observation errors. Uncertainties were derived via application of the central limit theorem (i.e. dividing the sampled spatial variance of regional soil moisture observations by the number of observations) and are based on an assumption of unbiased sampling with sufficient spatial coverage to guarantee independent sampling errors. Rainfall observations within the domain are also plotted for reference. While results indicate a relatively low root-mean-square-error (RMSE) for regional-scale TOPLATS predictions (0.032 cm$^3$ cm$^{-3}$), there is a general positive bias in model results, especially during the late June (June 25-27) and mid-July (July 10-12) portions of the experiment.
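The error-bar and RMSE calculations can be sketched as below. The sketch assumes that the $1\sigma$ uncertainty is the square root of the sampled spatial variance divided by the number of observations (the usual standard error of the mean under independent sampling); the function names and example values are hypothetical.

```python
import math
from statistics import mean, variance


def regional_mean_with_uncertainty(point_obs):
    """Regional mean and its 1-sigma sampling uncertainty (standard error)."""
    n = len(point_obs)
    # Central limit theorem: variance of the mean = sample variance / n.
    standard_error = math.sqrt(variance(point_obs) / n)
    return mean(point_obs), standard_error


def rmse(predicted, observed):
    """Root-mean-square error between paired predictions and observations."""
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(mean(squared_errors))


if __name__ == "__main__":
    # Made-up point-scale soil moisture values (cm3 cm-3) from four sites.
    regional_mean, sigma = regional_mean_with_uncertainty([0.1, 0.2, 0.3, 0.4])
    print(round(regional_mean, 3), round(sigma, 4))
```

The independence assumption matters: if sampling errors at neighboring sites are correlated, dividing by the full number of observations understates the uncertainty.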
Figs. 4-6 represent typical model evaluation plots that are often created in an attempt to validate spatially distributed land surface model predictions. Soil moisture observations during SMEX02, however, allow for a more intensive evaluation of distributed soil moisture predictions than is typically possible. For example, ground-based watershed soil moisture sampling provides daily estimates of field-scale soil moisture. Fig. 7 is a scatterplot of observed versus modeled field-scale soil moisture for the entire SMEX02 period. Intercomparing TOPLATS and ground-based observations at the field-scale, as opposed to the domain-scale as in Fig. 6, raises the RMSE of model results from 0.032 cm$^3$ cm$^{-3}$ to 0.043 cm$^3$ cm$^{-3}$. However, the presence of statistically significant correlation in Fig. 7 suggests that the model is accurately representing at least a portion of the observed field-scale variability.
The remainder of this section uses the extensive soil moisture data set collected during SMEX02 to evaluate the potential value of distributed land surface model predictions for the estimation of footprint-scale soil moisture means required for validation of spaceborne soil moisture retrievals. Specifically, Section 5.1 assesses to what degree multi-scale soil moisture heterogeneity observed during SMEX02 is accurately captured by TOPLATS simulations. Then, using strategies introduced in Section 4, Section 5.2 describes upscaling results based on the merger of field-scale observations with distributed TOPLATS predictions.

TOPLATS representation of soil moisture heterogeneity
Here three diagnostic statistics are used to evaluate the quality of spatially distributed TOPLATS soil moisture predictions: multi-scale spatial standard deviations, Spearman rank coefficients, and semivariograms. Spatial standard deviation and Spearman rank results reflect model skill in predicting lumped spatial statistics and relative spatial patterns, respectively. Semivariogram results demonstrate key changes in spatial sampling prospects that arise from the integration of model results.
The sampled spatial variance of a soil moisture field varies as a function of the support scale for measurements and the extent scale for the calculation of statistics [40]. Measurement support is defined as the spatial scale over which a given measurement or estimate integrates spatial information. Spatial supports can be increased through the aggregation of fields and estimation of means at coarser spatial scales. The extent scale is the spatial scale over which measurements are sampled to obtain spatial statistics. Specification of both scales is critical for efforts to precisely define spatial statistics for any geophysical field. Fig. 8 contains comparisons of TOPLATS modeled and observed soil moisture spatial standard deviations at three different support and extent scale combinations. Observation-based results in Fig. 8a (with point-scale support and field-scale extent) are derived from estimating the variance of all 14 theta probe measurements taken within each watershed field (Section 3) and linearly averaging the sub-field-scale variance for each field across all 31 fields in the watershed. Model results are based on a similar methodology applied to all 90-m TOPLATS pixels within each watershed field. Observed field-scale soil moisture estimates (required for the calculation of Fig. 8b) are calculated through simple averaging of all 14 sub-field theta probe observations within each field. A sampled standard deviation is then calculated using estimated field-scale averages for all 31 watershed fields. The impact of sampling uncertainty within estimated field-scale means is approximated using the central limit theorem, and used to correct the standard deviation of field-scale means for the sampling error in the means themselves. Standard deviation estimates in Fig. 8c are based on all 47 point-scale observations made at regional sampling sites (Section 3 and Fig. 2) and all 90-m TOPLATS pixels in the regional model domain.
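The support/extent combinations behind Fig. 8a and 8b can be sketched as follows. The exact form of the sampling-error correction is not given in the text; this sketch assumes that the mean sampling variance of the field means (within-field variance divided by the number of sub-field points) is subtracted from the raw variance of the field means. Function names and the example data are hypothetical.

```python
import math
from statistics import mean, variance


def subfield_standard_deviation(fields):
    """Point support, field extent: root of the within-field variance averaged over fields."""
    return math.sqrt(mean([variance(points) for points in fields]))


def field_mean_standard_deviation(fields):
    """Field support, watershed extent: std of field means, corrected for sampling error.

    Assumed correction: subtract the average sampling variance of the field
    means (within-field variance / number of points) from the raw variance.
    """
    field_means = [mean(points) for points in fields]
    raw_variance = variance(field_means)
    sampling_variance = mean([variance(points) / len(points) for points in fields])
    # Guard against small negative values caused by sampling noise.
    return math.sqrt(max(raw_variance - sampling_variance, 0.0))


if __name__ == "__main__":
    # Two hypothetical fields with two point measurements each.
    fields = [[0.1, 0.3], [0.3, 0.5]]
    print(round(subfield_standard_deviation(fields), 4),
          round(field_mean_standard_deviation(fields), 4))
```

Without the correction, noise in each field mean would inflate the apparent field-to-field variability, most noticeably when within-field variability is large relative to the differences between fields.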
In general, TOPLATS predictions of soil moisture variability are consistent with variability measured by ground-based observations. Notable exceptions include the model's underestimation of field-scale soil heterogeneity after July 10 in Fig. 8b and overestimation of point-scale variability within the entire regional domain (Fig. 8c) prior to rainfall on July 4. The sharp increase in both modeled and observed field-scale variability in Fig. 8b on July 5 is due to spatially heterogeneous rainfall on that date. Much of this heterogeneity is subsequently eliminated by more spatially homogeneous rainfall on July 10. Limitations in the extent and spacing of SMEX02 ground observations prevent comparisons at scale combinations other than those plotted in Fig. 8.
More critical to the goal of upscaling soil moisture than recovery of simple spatial statistics is evidence that models can accurately capture relative spatial patterns in soil moisture fields. Spearman rank correlation coefficients ($S_R$) describe the strength of correlation between the relative rankings of two random variables. To calculate these coefficients, all 31 watershed fields are ranked according to their moisture content. Separate rankings are constructed based on both actual ground observations and TOPLATS soil moisture predictions. Spearman coefficients ($S_R$) are then calculated as:

$$S_R = 1 - \frac{6 \sum_{j=1}^{n} d_j^2}{n \left( n^2 - 1 \right)} \qquad (12)$$

where $d_j$ is the difference in rank for a given field $j$ when ranking is performed using observations versus predictions and $n$ is the number of ranked fields. The magnitude of these coefficients reflects the degree to which model results are capable of reproducing the relative ranking of fields according to observed soil wetness. Fig. 9 contains a time series of $S_R$ values between field-scale model output and observations for all of the watershed sampling fields during the SMEX02 period. Rank correlation coefficients calculated for individual days during SMEX02 are statistically significant at the $1\sigma$ level on all watershed sampling days and at the $2\sigma$ level on nine out of 11 days. The highest rank coefficient is found on the day (July 5) exhibiting the largest field-scale variability due to locally heterogeneous rainfall (see July 5 in Fig. 8b). Relatively lower rank coefficients on June 25-27 indicate that TOPLATS results have less skill in reproducing relative spatial patterns on these days despite its success in reproducing observed spatial statistics (see June 25-27 in Fig. 8b).
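A minimal sketch of the rank correlation calculation is given below, using the classical formula $S_R = 1 - 6\sum d_j^2 / [n(n^2 - 1)]$. It assumes no tied values among the fields, since that formula is exact only in the absence of ties; the function names and example values are illustrative.

```python
def ranks(values):
    """Rank values from 1 (driest) to n (wettest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda index: values[index])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rank_coefficient(observed, modeled):
    """Spearman rank correlation between observed and modeled field values."""
    n = len(observed)
    rank_differences_squared = sum(
        (r_obs - r_mod) ** 2
        for r_obs, r_mod in zip(ranks(observed), ranks(modeled))
    )
    return 1.0 - 6.0 * rank_differences_squared / (n * (n * n - 1))


if __name__ == "__main__":
    # Made-up field-scale soil moisture for four fields on one day.
    observed = [0.10, 0.20, 0.30, 0.40]
    modeled = [0.15, 0.12, 0.33, 0.40]
    print(spearman_rank_coefficient(observed, modeled))
```

In the example the model swaps the two driest fields but orders the rest correctly, giving a coefficient of 0.8. Because only ranks enter the calculation, a uniform model bias has no effect on $S_R$.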
Results in Fig. 9 demonstrate that TOPLATS exhibits a statistically significant level of skill in accurately identifying relatively wet and dry areas within the Walnut Creek watershed area. This implies, but does not guarantee, that subtracting TOPLATS predictions from the actual soil moisture fields filters underlying spatial variability. A better test is the direct intercomparison of semivariograms for both the soil moisture observation, $\theta_{i,j}$, and model/observation difference, $\theta_{i,j} - \theta'_{i,j}$, fields. Fig. 10 shows semivariograms for spatial perturbations within both fields using field-scale model predictions and observations for all 31 watershed sampling sites. No spatial statistical inhomogeneity or anisotropy is detectable in either field. However, variations in spatial rainfall patterns introduce large temporal variability in calculated semivariograms. To ensure temporal stationarity, semivariograms shown in Fig. 10 are calculated separately for three discrete time periods (June 25-27, July 5-6 and July 7-8) that fall between major rainfall events. Relative to the underlying soil moisture observations, model/observed differences in July are less spatially variable (i.e. have a lower semivariogram sill) and appear to demonstrate a smaller correlation length. This suggests that the subtraction of the modeled field $\theta'_{i,j}$ from the observed field $\theta_{i,j}$ filters large-scale spatial variability in the soil moisture field. Removing this variability leads to fields in which spatially sparse or clumped observations can be upscaled to coarser spatial scales with smaller sampling errors. It is this ability to filter soil moisture variability which forms the basis of the model-based upscaling procedure described in Section 4. Earlier in the SMEX02 period, however, evidence for model-based improvement is weaker.
Due to low levels of background soil moisture variability, the impact of subtracting model predictions is much smaller during the June 25-27 period (top panel Fig. 10).
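To make the semivariogram comparison concrete, the sketch below computes a simple binned empirical semivariogram for a synthetic observed field and for its model/observation difference. Everything here is an illustrative assumption: the estimator (half the mean squared difference of all pairs in a lag bin), the lag bins, the site coordinates, and the synthetic fields, in which the "model" captures a large-scale gradient exactly. It is not the procedure or data used to produce Fig. 10.

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Binned semivariogram: half the mean squared difference per lag bin."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)
    i, j = np.triu_indices(len(values), k=1)  # every site pair exactly once
    lags = np.linalg.norm(coords[i] - coords[j], axis=1)
    half_sq_diff = 0.5 * (values[i] - values[j]) ** 2
    gamma = np.full(len(bin_edges) - 1, np.nan)  # NaN marks an empty bin
    for b in range(len(bin_edges) - 1):
        in_bin = (lags >= bin_edges[b]) & (lags < bin_edges[b + 1])
        if in_bin.any():
            gamma[b] = half_sq_diff[in_bin].mean()
    return gamma

# Synthetic example: 31 hypothetical sites in a 10 km x 10 km area
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(31, 2))
trend = 0.20 + 0.01 * coords[:, 0]            # large-scale moisture gradient
noise = rng.normal(0.0, 0.005, size=31)       # fine-scale variability
observed = trend + noise
modeled = trend                                # idealized model: trend only
bin_edges = [0.0, 2.5, 5.0, 10.0]              # lag bins in km

gamma_obs = empirical_semivariogram(coords, observed, bin_edges)
gamma_diff = empirical_semivariogram(coords, observed - modeled, bin_edges)
print(gamma_obs, gamma_diff)
```

Because the idealized model removes the gradient, the difference field retains only the uncorrelated noise, and its semivariance at the largest lag bin falls well below that of the observed field. Real model output would filter large-scale variability only partially.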

Field-scale upscaling results
Results in Section 5.1 suggest that TOPLATS simulations can reproduce a significant portion of the spatial and temporal heterogeneity found in SMEX02 soil moisture observations. One potential application of this apparent skill is the development of model-based upscaling strategies for soil moisture fields. Fig. 11 compares the three techniques presented in Section 4 for estimating mean soil moisture within the modeling domain shown in Fig. 1. The first approach, the observation-alone methodology presented in (9), is based on the simple averaging of field-scale soil moisture observations. The second, model-alone approach in (10) is based on the averaging of all 90-m model predictions within the entire regional domain. The third, combined approach in (11) integrates the first two approaches by using observations to correct model bias. As in Fig. 6, benchmark regional-scale soil moisture values $\theta_i$ are obtained by averaging daily observations from all 47 regional soil moisture sites and are used to evaluate the accuracy of the upscaling methodologies.
As a first test case, upscaling procedures are applied assuming ground-based observations are limited to a single watershed field $j$ ($n_{\mathrm{sites}} = 1$). For day $i$, this assumption simplifies (9) to
$$\hat{\theta}_i = \theta_{i,j}$$
and (11) to
$$\hat{\theta}_i = \theta'_i + (\theta_{i,j} - \theta'_{i,j})$$
where $\hat{\theta}_i$ is the resulting regional-scale estimate, $\theta_{i,j}$ and $\theta'_{i,j}$ are the observed and modeled soil moisture fields, respectively, and $\theta'_i$ is the regional-scale mean for modeled soil moisture. RMSE values for the observation-alone and combined approaches are pooled values obtained from applying this single-field methodology to all fields $j$. Using a single field-scale observation and the observation-only approach to estimate the regional-scale soil moisture leads to a daily RMSE for regional-scale soil moisture estimates ranging from 0.028 cm$^3$ cm$^{-3}$ on June 27 to 0.087 cm$^3$ cm$^{-3}$ on July 5, with a pooled value for all days of 0.057 cm$^3$ cm$^{-3}$.
Fig. 11. Normalized RMSE in domain-scale soil moisture estimates when applying the model-only methodology described in (10) and the combined model/observation methodology described in (11). RMSE values are normalized by errors associated with using the observation-only approach in (9) and displayed for the case of (a) upscaling individual fields independently and (b) upscaling the average of all 31 watershed fields in unison.
Normalized RMSE values plotted in Fig. 11a are the ratio of the RMSE for regional-scale soil moisture estimates obtained using a model-based approach to the RMSE obtained when applying the observation-only approach. All error quantities are calculated separately on a daily basis. A normalized value less than one means that information contained in modeled soil moisture fields lowered the RMSE of regional-scale soil moisture estimates on a given day. In Fig. 11a, normalized values for the purely model-based approach (10) exhibit a great deal of temporal variability. Reliance on unconstrained model predictions yields improved results (i.e. normalized error less than one) between July 2 and July 10, but is less reliable on all other days. In contrast, the combined method (11) is more stable and yields improved estimates, relative to observation-only results, on all days except June 27. The positive impact of integrating model results is also evident in pooled error statistics (i.e. RMSE, bias, and correlation coefficients) listed in Table 1 for both the observation-only and combined observation/model upscaling approaches.
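The single-field normalization described above can be illustrated with a short Python sketch. The estimator forms are assumptions inferred from the verbal descriptions in this section (observation-only: the field observation itself; model-only: the modeled regional mean; combined: the modeled regional mean plus the observed-minus-modeled field difference). All array names and numerical values are hypothetical.

```python
import numpy as np

def daily_rmse(estimates, benchmark):
    """RMSE per day, pooled over the field axis (axis 1)."""
    err = estimates - benchmark[:, None]
    return np.sqrt(np.mean(err ** 2, axis=1))

def single_field_normalized_rmse(obs, mod_field, mod_regional, benchmark):
    """Normalized daily RMSE for the model-only and combined estimates.

    obs, mod_field : arrays of shape (n_days, n_fields)
    mod_regional, benchmark : arrays of shape (n_days,)
    """
    obs_only = obs                                      # one estimate per field
    model_only = mod_regional[:, None]                  # same estimate for all fields
    combined = mod_regional[:, None] + (obs - mod_field)
    rmse_obs = daily_rmse(obs_only, benchmark)
    return (daily_rmse(model_only, benchmark) / rmse_obs,
            daily_rmse(combined, benchmark) / rmse_obs)

# Hypothetical two-day, three-field example (cm^3 cm^-3)
benchmark = np.array([0.20, 0.25])           # regional-scale reference means
offsets = np.array([-0.05, 0.0, 0.05])       # true field departures from the mean
obs = benchmark[:, None] + offsets
# Assumed model: captures 80% of the spatial pattern, with a +0.02 wet bias
mod_field = benchmark[:, None] + 0.8 * offsets + 0.02
mod_regional = benchmark + 0.02

norm_model, norm_combined = single_field_normalized_rmse(
    obs, mod_field, mod_regional, benchmark)
print(norm_model, norm_combined)
```

In this constructed case the combined estimate cancels the model bias and retains only the 20% of the spatial pattern that the model misses, so its normalized RMSE is 0.2 on both days. The model-only value depends entirely on the assumed bias.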
Poor normalized error results for the combined and model-only approaches on June 27 in Fig. 11a are due both to a low RMSE for the observation-only approach on that day (used as a normalizing factor) and to relatively poor model performance during the tail end of a gradual dry-down in late June (Fig. 6). The success of the observation-only approach is tied to a lack of field-scale soil moisture variability during this period (top panel in Fig. 10), which eases the severity of sampling problems typically associated with observation-only upscaling. Dry conditions during late June/early July also lead to relatively poor model estimates of soil moisture and may degrade the accuracy of model-based upscaling procedures. To examine this issue in detail, results in Fig. 11 can be plotted as normalized error versus mean soil moisture to isolate any relationship between the accuracy of model-based upscaling results and mean soil moisture (not shown). Since the spatial structure of both observed and modeled soil moisture is known to change with mean soil moisture [29,12], the accuracy of model- and observation-based upscaling strategies may exhibit a similar dependence, particularly if model results are prone to bias in dry conditions. However, when limited to the 11 sampling days available during SMEX02, no statistically significant relationship between model-based upscaling errors and mean soil moisture could be identified.
Future ground-based soil moisture networks may allow for observations at multiple field-scale sites. Therefore, as a second test, upscaling strategies are applied assuming the availability of field-scale observations at all the Walnut Creek watershed sites in Fig. 2. For this case, regional-scale estimates for the observation-only and combined upscaling approaches are calculated by inserting $w_j = n_{\mathrm{sites}}^{-1}$ and $n_{\mathrm{sites}} = 31$ into (9) and (11). Estimating regional-scale soil moisture using simple linear averaging of all available field-scale observations reduces the pooled RMSE for observation-only results to 0.028 cm$^3$ cm$^{-3}$ (Table 1). Fig. 11b is analogous to Fig. 11a except that (9) and (11) upscale using the simple linear average of observations and model/observation differences from all 31 watershed fields as opposed to just a single field. Absolute errors for the model-only and combined approaches are normalized by the daily RMSE associated with applying the observation-only approach assuming the availability of data from all 31 watershed fields. As in Fig. 11a, the purely model-based approach (10) demonstrates very low normalized error on some days but is unreliable on other days. The combined approach in (11) is more robust to day-to-day variations in hydrologic conditions and consistently improves the accuracy of regional-scale soil moisture means. Pooled error results for the observation-only and combined cases are listed in Table 1 for comparison.
Table 1. Pooled error statistics (RMSE, bias, and correlation coefficient $R^2$) for regional-scale soil moisture estimates derived using the observation-only (9) and combined observation/model (11) upscaling approaches.
A key consideration for model-based approaches is their sensitivity to model parameter uncertainty. Like many land surface models, TOPLATS suffers from a complex parameterization which requires a large number of specified parameters (see Section 2.2). To assess the impact of parameter uncertainty on key results, five key model parameters were individually multiplied by factors of both 0.80 and 1.20: surface albedo, saturated soil hydraulic conductivity, the Brooks-Corey soil pore size distribution index, the $p$ parameter in (2) used to predict fractional vegetation coverage, and the $b$ parameter in (6) used to predict soil resistance to evaporation. Model-based upscaling results were then recalculated using these 10 perturbed parameter sets (5 parameters times 2 perturbation types). Plotted ranges in Fig. 12 indicate the absolute spread of normalized error associated with upscaling field-scale observations for all 10 TOPLATS simulations. Even modest amounts of parameter uncertainty (i.e. 20%) are capable of significantly impacting the value of the model-only approach (10), especially during dry parts of the simulations. In contrast, results for the combined approach (11) demonstrate less sensitivity to parameter variations.
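The one-at-a-time perturbation design described above (five parameters, each scaled by 0.80 and 1.20) can be enumerated as in the following sketch. The parameter keys and baseline values are hypothetical placeholders chosen only for illustration. They are not TOPLATS settings or identifiers from the model code.

```python
def one_at_a_time_perturbations(baseline, factors=(0.80, 1.20)):
    """Return (name, factor, parameter_set) with one parameter scaled per set."""
    perturbed_sets = []
    for name in baseline:
        for factor in factors:
            params = dict(baseline)             # copy so the baseline is untouched
            params[name] = baseline[name] * factor
            perturbed_sets.append((name, factor, params))
    return perturbed_sets

# Hypothetical baseline values (placeholders, not calibrated TOPLATS inputs)
baseline = {
    "surface_albedo": 0.20,
    "saturated_hydraulic_conductivity": 1.0e-5,
    "brooks_corey_pore_size_index": 0.25,
    "p_vegetation_fraction": 1.0,
    "b_soil_resistance": 1.0,
}

runs = one_at_a_time_perturbations(baseline)
for name, factor, params in runs:
    print(f"{name} x {factor:.2f} -> {params[name]:.3g}")
```

Each of the 10 resulting parameter sets would drive one model simulation. The minimum and maximum normalized error across those runs on a given day would then give a plotted range of the kind described for Fig. 12.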

Point-scale upscaling results
The primary focus of soil moisture sampling during SMEX02 was the estimation of field- and regional-scale soil moisture means for the validation of airborne and spaceborne soil moisture products. However, as noted in Section 3, some fixed point-scale soil moisture observations are also available. Using time series data from fixed soil moisture sensors active during SMEX02, upscaling results in Section 5.2 can be repeated for the case of point-scale (rather than field-scale) ground-based observations. The upscaling procedure used is identical to that employed for field-scale observations in Fig. 11 except that $\theta_{i,j}$ in (9) and (11) now refers to point-scale (as opposed to field-scale) observations and $\theta'_{i,j}$ refers to the single 90-m TOPLATS pixel whose center is closest to the point-scale observation (rather than the average of all 90-m TOPLATS pixels in a given field). Calculations of model-only predictions via (10) do not change. Assuming the availability of a single point-scale observation, replotting Fig. 11a using this point-scale methodology (not shown) reveals that the combined approach (11) improves regional-scale soil moisture predictions, relative to observation-only results, on only six of 13 days, as compared to 10 of 11 days for upscaling field-scale observations. Table 2 contains error statistics for regional-scale soil moisture means calculated based on point-scale observations. Unlike field-scale results in Table 1, there is no indication that TOPLATS modeling improves the upscaling of a single point-scale soil moisture observation. However, incorporating model results does lead to a small improvement when upscaling the mean of all twelve fixed point-scale measurements (Table 2).
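Matching a point-scale sensor to the model pixel whose center is closest can be sketched as below for a regular grid. The grid layout is an assumption made for illustration: a north-up grid in projected coordinates (meters), with `x0` and `y0` giving the center of pixel (row 0, column 0) and rows increasing with `y`. The function name and example coordinates are hypothetical.

```python
import math

def nearest_pixel_index(x, y, x0, y0, n_rows, n_cols, pixel_size=90.0):
    """Return (row, col) of the pixel whose center is nearest to (x, y).

    (x0, y0) is the center of pixel (0, 0); columns increase with x and
    rows increase with y (an assumed grid orientation).
    """
    # floor(v + 0.5) rounds half up and avoids Python's round-half-to-even
    col = math.floor((x - x0) / pixel_size + 0.5)
    row = math.floor((y - y0) / pixel_size + 0.5)
    if not (0 <= row < n_rows and 0 <= col < n_cols):
        raise ValueError("point lies outside the model grid")
    return row, col

# Hypothetical sensor located 100 m east and 40 m north of the first pixel center
row, col = nearest_pixel_index(100.0, 40.0, x0=0.0, y0=0.0, n_rows=10, n_cols=10)
print(row, col)
```

With the index in hand, the modeled value paired with the sensor would simply be the grid value at `[row, col]` for each day, in place of the field average used in the field-scale case.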

Discussion and conclusions
The spatial scaling properties of surface soil moisture are linked to both temporal dry-down dynamics [29,31,12] and spatially variable land surface properties (e.g. soil texture [9] and topography [39,6]). By synthesizing the appropriate distributed forcings (e.g. rainfall, digital elevation models, and soil texture maps), soil moisture predictions from a distributed land surface model represent a best guess as to the impact of these processes on the dynamic evolution of sub-footprint-scale (<30 km) soil moisture patterns. Sufficiently accurate knowledge of these patterns allows for the linking of local-scale soil moisture observations to footprint-scale soil moisture means. This analysis is aimed at evaluating the potential for a model-based upscaling approach using soil moisture observations obtained during the SMEX02 field experiment. Results in Figs. 6-10 demonstrate the ability of distributed TOPLATS simulations to accurately reflect space/time patterns of variability observed in ground-based soil moisture sampling during SMEX02. Here this skill is exploited to upscale ground-based observations from the field-scale to the regional modeling domain shown in Fig. 1. During SMEX02, the combined model/observation upscaling approach (11) is demonstrated to be superior to simple averaging of ground-based observations using (9) for all but one day (Fig. 11 and Table 1) and more robust to parameter uncertainty and day-to-day variability in hydrologic conditions than using pure model predictions and (10) to estimate regional-scale soil moisture means (Fig. 12). Improvements in model-based upscaling results stem from statistical differences between the model bias field $\theta - \theta'$ and the underlying soil moisture field $\theta$. Subtracting model results from field-scale observations removes spatially correlated variability in the soil moisture field (Fig. 10). As a result, large-scale soil moisture estimates are more accurate when derived from sparse spatial sampling of the model bias field versus the original soil moisture field. Taken as a whole, results suggest that model-based upscaling procedures can improve the estimation of large-scale soil moisture means required for the validation of footprint-scale soil moisture retrievals from spaceborne radiometers.
Table 2. Pooled error statistics (RMSE, bias, and correlation coefficient $R^2$) for regional-scale soil moisture estimates derived using the observation-only (9) and combined observation/model (11) upscaling approaches.
The model-based upscaling procedure discussed in Section 5.2 is based on point-scale ground-based observations that have been spatially averaged up to the field-scale. Such field-scale estimates are not currently available outside of intensive field campaigns since operational soil moisture networks typically consist of single-point observations separated by tens of kilometers. For this upscaling strategy to be of immediate value, it must be validated for point-scale observations. Unfortunately, analysis of point-scale data during SMEX02 gives no indication that the procedure can be applied successfully to point-scale data (Section 5.3 and Table 2). The most likely explanation is that, relative to coarser field-scale heterogeneity, TOPLATS predictions have less skill in predicting the spatial structure of sub-field-scale variability that impacts point-scale observations. Consequently, the benefits of model-based upscaling emerge only when fine-scale variability is filtered by averaging a sufficient sample of point-scale observations up to the field-scale. Sufficiently dense soil moisture probe networks within a single field are technically possible but may not prove practical. A potential alternative to maintaining dense sampling networks within individual fields is to rely on time stability strategies capable of linking point-scale observations to field-scale means (see e.g. [22]). A combination of a time-stability approach to upscale from the point- to field-scale and then a model-based approach to upscale from the field- to footprint-scale may be optimal, as it would allow each approach to focus on the spatial scales at which it is most effective.
Several additional points are worth noting when considering results presented here. First, all domain-scale soil moisture estimates presented are based on simple linear averaging of field-scale observations and/or field-scale differences between modeled and observed soil moisture. Block-kriging methodologies provide a more sophisticated approach to aggregating these observations based on observed auto-correlation structure. However, given the spatial shortcomings of current ground-based systems (i.e. sparse sampling patterns and limited spatial extents) it may be difficult to accurately obtain such correlation information within many footprints. Here, ground-based soil moisture information was assumed to be limited to a series of field-scale observations clustered in and around the Walnut Creek watershed. This precluded the ability to estimate the large-scale (>25 km) correlation information required to effectively implement kriging strategies.
In addition, hydrologic aspects of the SMEX02 site may make it a poor case study for non-agricultural watersheds. Accurate parameterization of topographically-driven lateral flows is often cited as a key element in predicting soil moisture spatial patterns at local hillslope scales (10-500 m) [14,43]. However, in the SMEX02 region the extensive use of tile drains to maximize land area available for cultivation has significantly reduced the natural lateral redistribution of water. Extensive artificial drainage is not uncommon in agricultural watersheds and can be accommodated by modifications to TOPMODEL calibration parameters. Nevertheless, it is unclear how relevant results derived here are to landscapes where lateral hydrologic flows are unimpeded.
Finally, a persistent problem for any analysis of soil moisture heterogeneity has been that sufficiently intensive soil moisture observations are typically limited to small time and space windows. Limited temporal coverage is clearly a weak point of this analysis. Ground-based sampling associated with the validation of current (AMSR-E) and future spaceborne soil moisture missions (e.g. the NASA Hydrospheric States Mission) will likely provide continued opportunities to evaluate and refine distributed surface soil moisture predictions from land surface models. Recently acquired soil moisture data from the Soil Moisture Experiment in 2003 (SMEX03) within regional domains in Oklahoma, Alabama, Georgia, and Brazil provide an immediate goal for future research.