To improve model soil moisture estimation in arid/semi-arid region using in situ and remote sensing information

Soil moisture plays a key role in water and energy exchange in land hydrologic processes. Reliable soil moisture information can be used for many applications in weather and hydrological forecasting, water resources, and irrigation system management and planning. However, accurately modeling soil moisture variation within the soil column remains very challenging. In this study, in situ and remotely sensed near-surface soil moisture is assimilated into the Noah land surface model (LSM) to estimate deep-layer soil moisture variation. The sequential Monte Carlo-Particle Filter technique, well known for its capability to model highly nonlinear and non-Gaussian processes, is applied to propagate surface soil moisture measurements to the deep layers. The experiments were carried out at several locations in the semi-arid region of the US. Compared with in situ observations, the assimilation runs show substantial improvement over the control (non-assimilation) runs in estimating both soil moisture and temperature at 5-, 20-, and 50-cm soil depths in the Noah LSM.


Introduction
Soil moisture is a key element in land surface hydrologic processes, and it plays a vital role in water and energy cycles. Providing accurate soil moisture is essential for improving mathematical modeling for weather and hydrological forecasting, climate prediction, water resource and irrigation management/scheduling, and agricultural production estimates (Narasimhan and Srinivasan 2005; Walker and Houser 2001; Beljaars et al. 1996; Drusch 2007; Mahfouf 2010; Dirmeyer 2000; Koster and Suarez 2003; Rosenzweig et al. 2002). Field-based soil moisture measurements are not available for most practical applications. Remotely sensed soil moisture from active or passive microwave data is becoming available, but it carries uncertainty and is limited to top-layer soil wetness. Modeled soil moisture provides gridded values and extends to deep soil layers. However, previous studies indicated that current land surface models (LSMs) have deficiencies in accurately modeling soil moisture variation. A promising approach is to assimilate remotely sensed and observed soil moisture into an LSM to improve model accuracy. In this study, we evaluate the top-layer soil moisture estimates from the Advanced Microwave Scanning Radiometer-Earth Observing System (AMSR-E) on the NASA EOS Aqua satellite and apply them to retrieve deep-layer soil moisture as well as other fluxes using the Noah land surface hydrologic model and advanced data assimilation techniques.
Data assimilation is an analytic method for merging uncertain model predictions with imperfect observational data in a way that is consistent with a model system's physical descriptions and permits better estimates and reduced uncertainty (Liu and Gupta 2007; Reichle et al. 2008). Recently, data assimilation methods have been used to integrate ground-based, airborne, and especially satellite observations of near-surface soil moisture or brightness temperature into LSMs. For example, Walker et al. (2001) assimilated remotely sensed near-surface soil moisture into an LSM using the Kalman Filter (KF). Reichle et al. (2007) assimilated surface soil moisture retrieved from AMSR-E into the Catchment land surface model (CLSM); they found that soil moisture estimated from data assimilation is better than that retrieved from satellites and from model control runs.
Several studies have evaluated data assimilation methods. Sabater et al. (2007) assimilated surface soil moisture from the Surface Monitoring of the Soil Reservoir Experiment (SMOSREX) into the Interaction between Soil, Biosphere, and Atmosphere Scheme (ISBA) LSM to investigate root zone soil moisture. They used four assimilation methods, including the variational methods, KF, extended KF (EKF), and ensemble KF (EnKF), and suggested that 1D-VAR is the best and that EnKF is a "promising technique." Reichle et al. (2008) used an adaptive EnKF to assimilate soil moisture into CLSM and suggested that the adaptive EnKF method can generally identify model and observation error variances and improve assimilation estimates when compared with EnKF output. Besides the observation errors (e.g., in remote sensing-retrieved soil moisture), Reichle et al. (2004) and Koster et al. (2009) found systematic differences in soil moisture between the observations and LSM outputs. Koster et al. (2009) also found that these systematic differences vary depending on many factors, such as satellite sensors, retrieval algorithms, and LSMs, which pose challenges for combining these datasets.
Both KF and EnKF assume that all probability distributions involved are Gaussian, whereas most physical models are nonlinear and non-Gaussian. When a nonlinear relationship exists between a state and observed data, EnKF provides less effective simulation (Jardak et al. 2010). The Sequential Monte Carlo-Particle Filter (SMC-PF), however, can be more effective than KF and EnKF in modeling highly nonlinear and non-Gaussian processes (Arulampalam et al. 2002; Doucet et al. 2000). This study explores the use of the PF for soil moisture data assimilation. Top-layer soil measurements from remote sensors are used in the data assimilation to improve the estimation of soil moisture and temperature in deeper layers.
The remainder of this paper is organized as follows. The "Develop data assimilation approach in land surface hydrologic modeling" section describes the LSM and the data assimilation techniques used. It gives a brief discussion of the Noah land surface process model and presents the Bayesian SMC-PF approach for data assimilation. The "Study area and data for the experiment" section describes the data used in the case study, including in situ measurements and remote sensing data. Case studies were implemented for specific test sites in California and in the ARS Walnut Gulch watershed, where ground soil moisture profile measurements are available for validation. Finally, discussion and conclusions are given in the "Case study" section.
Develop data assimilation approach in land surface hydrologic modeling

Soil moisture estimation from Noah LSM
In this study, the Noah LSM is used as the physical model. There are practical reasons for this choice. For example, Noah is broadly used either standalone or coupled with weather and climate models (e.g., WRF and GFS); furthermore, many land data assimilation systems also use the Noah LSM as their physical model (e.g., NLDAS, GLDAS, LIS, and HRLDAS).
In our experiment, the Noah LSM is used for DA experiments to estimate soil moisture at the test sites. The Noah LSM combines the diurnally dependent Penman potential evaporation approach, a multilayer (4-layer) soil model, and a modestly complex canopy model (Chen et al. 1997). In the canopy model, Noah prescribes the vegetation indices (e.g., greenness coverage fraction, vegetation types, roughness, and albedo) while modeling canopy conductance as a function of soil moisture availability, solar radiation, air temperature, and humidity (Chen and Dudhia 2001). Noah estimates soil temperature using the thermal diffusion equation and parameterizes thermal conductivity based on Peters-Lidard et al. (1998). Noah obtains the surface temperature by solving the energy balance equation, and calculates surface flux exchange coefficients using similarity theory-based stability functions (Chen et al. 1997). Soil moisture is estimated using Richards' equation, while the surface runoff and infiltration methods are based on Schaake et al. (1996). Details about the Noah LSM can be found in Chen and Dudhia (2001) and Ek et al. (2003). Here, we focus on the aspects that are most relevant to soil moisture estimation.

The Noah LSM estimates the moisture of the top soil layer with Eq. 1, in which $s$ is the volumetric soil moisture content, $d_{z_1}$ is the topsoil layer thickness, $P_d$ is the precipitation not intercepted by the canopy (including condensation), $E_{t_1}$ is the canopy transpiration taken up by the canopy roots in the top layer, $E_{dir}$ is the direct evaporation from the ground surface, $D$ is the soil water diffusivity, and $K_{z_1}$ is the soil water hydraulic conductivity from the top layer to the second soil layer. At the bottom soil layer, the equation is very similar to Eq. 1, with three changes: the last four terms on the right side are dropped, the soil water hydraulic conductivity from the third layer to the bottom layer is added, and the soil water hydraulic conductivity out of the bottom layer becomes the base flow.
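As a reading aid, the top-layer and bottom-layer balances described above can be written out in the form used by Chen and Dudhia (2001). This is a sketch under stated assumptions: it uses the symbols defined in the paragraph above, and it adds $R$ for surface runoff, a symbol that the paragraph does not define. The signs follow the Chen and Dudhia convention and should be checked against that reference.

```latex
% Top soil layer (Eq. 1 in the text); R (surface runoff) is an assumed symbol
d_{z_1}\,\frac{\partial s_1}{\partial t}
  = -D\left(\frac{\partial s}{\partial z}\right)_{z_1} - K_{z_1}
    + P_d - R - E_{dir} - E_{t_1}

% Bottom (fourth) soil layer: the last four terms are dropped, the flux
% from the third layer is added, and K_{z_4} acts as the base flow
d_{z_4}\,\frac{\partial s_4}{\partial t}
  = D\left(\frac{\partial s}{\partial z}\right)_{z_3} + K_{z_3} - K_{z_4}
```

The four dropped terms ($P_d$, $R$, $E_{dir}$, $E_{t_1}$) are the surface exchanges, which is why a surface observation influences the bottom layer only through the diffusion and conductivity fluxes between layers.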

Sequential Monte Carlo-Particle Filter
Both KF and EnKF assume that all probability distributions involved are Gaussian, while most physical models are nonlinear and non-Gaussian. Moreover, when a nonlinear relationship exists between a state and observed data, EnKF is not entirely effective. In such contexts, particle filters may have advantages over KF and EnKF (Jardak et al. 2010). A general dynamic system, like Eq. 2, can be represented by a dynamic state equation and a measurement equation, in which $x_t$ is an n-dimensional vector that consists of the system's state variables at a particular time $t$; $f(\cdot)$ and $h(\cdot)$ are nonlinear state and measurement functions; and $v_t$ and $w_t$ are process and measurement noise, respectively. Data assimilation (DA) makes use of a set of discrete observations $Z_t = [z'_1, z'_2, \ldots, z'_{t-1}, z'_t]$ at time $t$ and the preceding time steps, in which $z'_t$ is an m-dimensional vector formed by the variables measured in the system.
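A common way to write the state and measurement equations described above is shown below. This is a minimal sketch that assumes additive noise and omits forcing inputs and model parameters, which the authors' formulation may include.

```latex
% Dynamic state equation (additive process noise assumed)
x_t = f(x_{t-1}) + v_t

% Measurement equation (additive measurement noise assumed)
z'_t = h(x_t) + w_t
```

For the soil moisture problem, $f(\cdot)$ would correspond to one time step of the land surface model and $h(\cdot)$ would select the observed near-surface variables from the state vector.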
To adopt the dynamic process in Eq. 3, we could assume that soil moisture content and soil temperature are the state variables, while the process measurements are available through limited satellite measurements of surface temperature and soil moisture, along with in situ observations of soil moisture and ground water levels. Stochastic assimilation seeks the conditional probability density function (pdf), $p(x_t | Z_t)$, that describes the probability distribution of the model state given all the observations $Z_t$. The Sequential Bayesian Filter is a stochastic approach to obtain the "posterior" pdf, $p(x_t | Z_t)$, for a state vector $x_t$ of a system at a particular time $t$. In the prediction stage (Eq. 4), assuming the latest measurement $z'_t$ is not yet available, the conditional (prior) pdf of $x_t$ is calculated from the previous posterior. In the update stage (Eq. 5), the posterior pdf is obtained by updating the prior pdf using the measurement $z'_t$ via Bayes' rule.

SMC-PF methods are capable of providing posterior probability distributions of variables even for highly nonlinear models with a non-Gaussian error structure. The posterior distribution is revised from the initial (given) pdf at each time step by the likelihood function, which is calculated from the measurements $z'_t$ (t = 1, 2, …, t) and gives the better-predicted $x_t$ values higher weights (probability). The Monte Carlo (MC) approach is a numerical method that evaluates the pdf at discrete points in the system's state space. In the discrete format, $x_t = x_t^i$, where $i$ is the sample index, $N_s$ is the sample size, and $g^i_{t|t} = p(x_t^i | Z_t)$, Eqs. 4 and 5 are rewritten in terms of the sample weights.

During simulation, the PF may suffer from significant degeneracy and then requires resampling to redistribute the existing samples. Sequential importance resampling (SIR) removes samples with low importance weights (low probability) and assigns more samples to those with high importance weights (high probability). The new samples ($g^{i*}_t$) have uniform weights ($1/N_s$), and the number of repeated copies of each sample is proportional to its importance weight $g^i_t$ (Liu and Chen 1998; Arulampalam et al. 2002). A criterion based on the effective sample size can be used to evaluate PF degeneracy (Arulampalam et al. 2002; Doucet et al. 2000). Because resampling only redistributes samples among the existing points, the particles lose diversity after several resampling steps and then need to be redistributed. A regularization step can be used to further diversify the existing particles.
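For reference, the textbook forms of the prediction and update steps (Eqs. 4 and 5) and their particle approximations are sketched below, as commonly written following Arulampalam et al. (2002). The notation follows the text ($g^i$ for the sample weights, $z'_t$ for the measurement). The intermediate weight $g^i_{t|t-1}$ and the symbol $N_{\mathrm{eff}}$ are introduced here for illustration, and these are not necessarily the authors' exact expressions.

```latex
% Prediction (Chapman-Kolmogorov equation)
p(x_t \mid Z_{t-1}) = \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid Z_{t-1})\, dx_{t-1}

% Update (Bayes' rule)
p(x_t \mid Z_t) = \frac{p(z'_t \mid x_t)\, p(x_t \mid Z_{t-1})}{p(z'_t \mid Z_{t-1})}

% Discrete (particle) update of the sample weights, i = 1, ..., N_s
g^i_{t|t} = \frac{g^i_{t|t-1}\, p(z'_t \mid x^i_t)}
                 {\sum_{j=1}^{N_s} g^j_{t|t-1}\, p(z'_t \mid x^j_t)}

% Effective sample size used as a degeneracy criterion
N_{\mathrm{eff}} \approx \frac{1}{\sum_{i=1}^{N_s} \left(g^i_{t|t}\right)^2}
```

When all weights are equal, $N_{\mathrm{eff}} = N_s$; when a single particle carries all the weight, $N_{\mathrm{eff}} = 1$, which signals that resampling is needed.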
Particle filters have been shown to be very flexible for assimilating data in numerical model predictions (Doucet et al. 2000; Moradkhani et al. 2005; Kalnay 2003; Weerts and El Serafy 2006; Hsu 2011). PFs have been applied to hydrologic simulation and found to be very useful for estimating uncertainty in state variables. Weerts and El Serafy (2006) have shown that the SMC-PF with residual resampling (RR) outperforms EnKF when the sample size increases. In applications to nonlinear distributed modeling, a large number of state variables were estimated. Several techniques have been reported as providing effective ensemble prediction of ocean and atmospheric models. In this study, PF assimilation is applied to the Noah LSM for soil moisture estimation.
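The predict, weight, and resample cycle of a particle filter with sequential importance resampling can be sketched in a few lines. The example below is a minimal illustration for a scalar state, not the configuration used in this study. The function names, the Gaussian likelihood, the multinomial resampling draw, the resampling threshold `ess_frac`, and all numerical values are assumptions made for the sketch.

```python
import math
import random


def effective_sample_size(weights):
    """Estimate N_eff = 1 / sum(w_i^2) for normalized weights."""
    return 1.0 / sum(w * w for w in weights)


def sir_step(particles, weights, obs, f, h, q_std, r_std, rng, ess_frac=0.5):
    """One predict/update/resample cycle of a bootstrap particle filter."""
    n = len(particles)
    # Prediction: push each particle through f and add process noise.
    particles = [f(x) + rng.gauss(0.0, q_std) for x in particles]
    # Update: multiply each weight by the Gaussian likelihood of the observation.
    like = [math.exp(-0.5 * ((obs - h(x)) / r_std) ** 2) for x in particles]
    weights = [w * l for w, l in zip(weights, like)]
    total = sum(weights)
    # If every likelihood underflowed to zero, fall back to uniform weights.
    weights = [w / total for w in weights] if total > 0.0 else [1.0 / n] * n
    # Resample (multinomial draw) only when the effective sample size is low.
    if effective_sample_size(weights) < ess_frac * n:
        particles = rng.choices(particles, weights=weights, k=n)
        weights = [1.0 / n] * n
    return particles, weights


def run_demo(seed=0, n=500, steps=30, truth=0.25):
    """Track a constant (hypothetical) soil moisture value from noisy data."""
    rng = random.Random(seed)
    particles = [rng.uniform(0.05, 0.45) for _ in range(n)]
    weights = [1.0 / n] * n
    for _ in range(steps):
        obs = truth + rng.gauss(0.0, 0.02)  # synthetic noisy observation
        particles, weights = sir_step(
            particles, weights, obs,
            f=lambda x: x, h=lambda x: x,  # identity model and measurement
            q_std=0.01, r_std=0.02, rng=rng,
        )
    # Posterior mean: weighted average of the particles.
    return sum(w * x for w, x in zip(weights, particles))
```

Calling `run_demo()` returns the posterior mean after 30 synthetic observations. In this toy setup the added process noise also plays the role of a simple diversification step after resampling; a formal regularization step would replace it in a fuller implementation.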

Land surface hydrologic model and data assimilation
Accurate modeling of soil moisture is critical to our understanding of soil-vegetation-atmosphere interactions, hydrology, and the prediction of water availability. Due to the complexity associated with soil physics and to uncertainties in data and parameterization schemes, LSM simulations of soil moisture show marked deviations from observations, especially in semi-arid regions. In the Noah LSM, 19 state variables are included in the simulation. The state variables and parameters include: $\theta_s^i$ ($i = 1, \ldots, 4$), the total soil moisture at each layer; $\theta_l^i$ ($i = 1, \ldots, 4$), the liquid water content at each layer; $T_i$ ($i = 1, \ldots, 4$), the soil temperature at each node; SWE, the snowpack snow water equivalent; SNODH, the snowpack depth; $C_h$ and $C_m$, the surface exchange coefficients for heat (moisture) and momentum; CMC, the canopy moisture content; and $T_1$, the ground/snowpack/canopy effective temperature. Skin temperature ($T_1$) and topsoil moisture ($\theta_l^i$) can be obtained from remote sensing data, and they are assimilated into the Noah LSM.
We can conduct multi-sensor and multi-scale data assimilation in the Noah LSM. The simulation experiments are set at grid points where surface soil moisture measurements from AMSR-E on the Aqua satellite are available.
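To illustrate how a surface-only observation can still adjust the deeper layers in a particle filter, the sketch below weights each particle (a full four-layer moisture profile) by how well its top layer matches the observation, and then takes the weighted mean of every layer. The function names, the Gaussian likelihood, and the idea of passing profiles as plain lists are illustrative assumptions; this is not Noah LSM code.

```python
import math


def top_layer_weights(profiles, obs_top, obs_std):
    """Normalized importance weights from a top-layer observation only."""
    # Gaussian log-likelihood of the observation given each particle's top layer.
    logw = [-0.5 * ((obs_top - p[0]) / obs_std) ** 2 for p in profiles]
    m = max(logw)  # subtract the maximum to avoid numerical underflow
    w = [math.exp(v - m) for v in logw]
    total = sum(w)
    return [v / total for v in w]


def weighted_mean_profile(profiles, weights):
    """Posterior mean of every layer under the given particle weights."""
    n_layers = len(profiles[0])
    return [sum(w * p[k] for w, p in zip(weights, profiles))
            for k in range(n_layers)]
```

For example, with three hypothetical profiles (dry, medium, and wet) and a dry surface observation, the dry profile receives most of the weight, so the weighted mean of the 20-, 50-, and 100-cm layers also moves toward the dry profile. The deep layers are corrected only through their association with the top layer inside each particle, which is why the quality of the model's vertical coupling matters.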

Ground data and validation sites
Two case studies are included in this study. The first case tests the concept using in situ measurements at two Natural Resources Conservation Service (NRCS) gauge points in California, while the second case extends from in situ measurements to remote sensing data, using a gauge point from the US Department of Agriculture (USDA) Agricultural Research Service (ARS). The study area and selected gauge sites are shown in Fig. 1.

The NRCS test sites
At some NRCS gauge sites, besides precipitation and SWE, soil moisture and temperature at 5, 20, and 50 cm are measured daily. Two gauge sites with continuous daily and hourly measurements of SWE and precipitation since the 1980s and soil moisture data since 2000 are selected for this experiment: (1) gauge ID#518 at 38.917°N, 119.9167°W, at an elevation of 8,582 feet, and (2) gauge ID#697 at 38.5°N, 119.633°W, at an elevation of 7,736 feet.

The USDA ARS test site
The Walnut Gulch Experimental Watershed (WGEW) of USDA ARS is one of the most intensively instrumented semi-arid experimental watersheds in the world (Moran et al. 2008; Garcia et al. 2008; Goodrich et al. 2000; Kustas and Goodrich 1994). The extensive hydro-meteorological instrumentation covering the WGEW dates primarily from the early 1960s. One gauge site (Lucky Hill; 31.735°N, 110.052°W; elevation 4,494 feet) is selected for this experiment.

Satellite data
Surface soil moisture and surface temperature data generated from AMSR-E observations at daily and 0.25° resolution are used in this study. The data set is provided by Owe et al. (2008). The retrieval method for this dataset uses a forward modeling optimization procedure to solve a radiative transfer equation for both soil moisture and vegetation optical depth (Owe et al. 2008).

Forcing data
The proposed areas include test sites where meteorological fields (including precipitation, solar radiation, surface pressure, temperature, humidity, and wind) have been observed since 1990.For the other grid points without data, North American Land Data Assimilation System (NLDAS) forcing data were used.

Case study
Case study based on in situ observation: NRCS test site
Data from two NRCS sites are collected and evaluated. They are discussed below. Figure 2 shows the results of assimilating 5-cm soil moisture in the Noah LSM using SMC-PF with the SIR resampling strategy at one NRCS gauge site, gauge ID 697 at (38.505°N, 119.626°W) and 7,736-feet elevation. The control run uses the forcing data and default settings for the simulation, without the available top-layer soil moisture observations. The control run (black line) fits the wet period of the observations (red line) well, but underestimates soil moisture during the dry period (see green circles in the figure). SMC-PF (blue lines) gives better estimates than the control run at 5- and 20-cm depths. For the time periods with highly variable soil moisture, highlighted with green circles, the SMC-PF simulation fits the observations very well. At 50-cm depth, the SMC-PF estimate improves on the control run most of the time, but cannot capture the highly variable soil moisture in the observations (see highlighted green circle). Because water and energy exchange is more active near the surface, the variability of soil moisture in the upper layers is usually much higher than in the lower layers. The reason why the variability at the lower layer (see green circle at the 50-cm layer) is greater than at the upper layers is not clear.
Figure 3 shows the observed and simulated soil moisture at another NRCS gauge site (ID 518), located at 38.917°N, 119.9167°W. Compared with the observations (red line), the default control run (black lines) significantly underestimates soil moisture at all evaluation layers (5, 20, and 50 cm). The SMC-PF simulation gives very good estimates of soil moisture for all layers.
Soil temperature at the same test site (ID 518) is plotted in Fig. 4. Clearly, the control run (black lines) overestimates soil temperature at all test layers. SMC-PF assimilation substantially improves both the soil temperature (Fig. 4) and the soil moisture (Fig. 3) estimates relative to the control run.
Three statistics were calculated to evaluate the estimated soil moisture before and after assimilation: the mean value, the root mean square error (RMSE), and the bias of soil moisture (m³ m⁻³). Table 1 shows the evaluation statistics for the two NRCS sites. After assimilation of top-layer soil moisture, the estimated soil moisture at all layers is significantly improved over the model control runs.

For the second case study, at the USDA ARS test site, Figure 5a shows the soil moisture at the 5-cm layer from observations, the AMSR-E remote sensing retrieval, the Noah control run, and the AMSR-E assimilation run. Compared with the ground measurements (red line), the AMSR-E soil moisture retrieval (green line) performs reasonably well for high values but underestimates low moisture content. The control run (black line), without assimilation of top-layer soil moisture information, largely overestimates soil moisture. Assimilation of the AMSR-E soil moisture product, on the other hand, provides a better estimate than the control run. The assimilated estimates (blue line) fall between the control run estimates and the remote sensing observations. Although assimilation using SMC-PF overestimates high soil moisture contents, it improves on the control run.

Figure 5b-d shows the observations and model estimates of soil moisture at 20-, 50-, and 100-cm depths, respectively. Overall, assimilating top-layer AMSR-E soil moisture consistently improves the soil moisture estimates of the lower layers relative to the control run without assimilation. On closer inspection, the gauge-observed soil moisture at and below 50 cm is nearly stable (see Fig. 5c, d). This implies that soil moisture at depths of 50 cm and below is less sensitive to surface forcing (precipitation), unless heavy storm events occur. Water entering at the surface can be evaporated to the atmosphere before reaching the deep soil layers. Compared with the gauge soil moisture observations, the Noah model estimates are highly variable at the 50-cm layer and become stable (flat) only at the 100-cm layer. Although soil moisture assimilation improves the model soil moisture estimates, this behavior, which is related to model structure and parameter settings, requires further investigation.

An evaluation summary is listed in Table 2. Compared with site observations, the model results with remote sensing data assimilation show significant improvement in mean, RMSE, and bias for all layers (5, 20, 50, and 100 cm) relative to the control run.

Figure 6 shows surface soil moisture variation from the pre-monsoon to the post-monsoon period, July to September, from AMSR-E remote sensing estimates over northern Mexico and southern Arizona. The monsoon starts in May in southern Mexico and moves north to reach Arizona in June. The AMSR-E observations show the time evolution of top-layer soil wetness during the monsoon period. Figure 7 compares the spatial distribution of soil moisture from AMSR-E, the Noah model control run, and the assimilated SMC-PF run over Aug 14-17, 2005. AMSR-E soil moisture estimates show good agreement with observed soil moisture at the test site. The control run generates higher soil moisture than observed at the test point. The spatial distributions of soil moisture from the three estimates are similar, but the remotely sensed (AMSR-E) estimate gives the driest values at the top layer. Note that the top layer of the Noah model is defined at 5-cm depth, while the AMSR-E remote sensing estimates relate to soil moisture in the top surface layer (e.g., 0-2 cm). Discrepancies between AMSR-E and model estimates are therefore possible.

Figure 8 shows the estimated soil moisture from the control run and the SMC-PF simulation at 20 and 50 cm. Similar to the estimates at the 5-cm top soil layer, the SMC-PF simulation generates lower soil moisture than the control run.

Further experiments were carried out at the semi-arid USDA ARS experimental watershed. Remotely sensed soil moisture from AMSR-E on the NASA EOS Aqua satellite was used and evaluated. Compared with site observations, the remote sensing measurements underestimated soil moisture at 5-cm depth, while the Noah model control run overestimated soil moisture at all layers (from 5 to 100 cm). Assimilation of remotely sensed top-layer soil moisture into the Noah model was then evaluated; the vertical soil moisture profile at the test point was found to be effectively improved.

The test sites in this study are in a semi-arid region with low vegetation and dry weather. Remote sensing data provide reasonably good top-layer soil moisture estimates; as a result, the model-assimilated retrieval also improved significantly over the non-assimilated (control) run. For regions with dense vegetation or large canopies, the brightness temperature received by passive microwave sensors is complicated by the mixture of microwave emissions from multiple surface properties and water contents. Studies to improve microwave sensing using L-band active and passive microwave sensors are planned in the current European Space Agency (ESA) and NASA programs. It is

Fig. 6 Example of remotely sensed surface soil moisture on the 10th, 20th, and 30th of July (top), August (middle), and September (bottom) 2005. The star indicates the site location shown in Fig. 4. Heavy solid lines indicate the basin boundaries. The thin line indicates the boundary between Mexico and Arizona, US

Fig. 7 Top-layer mean soil moisture for Aug 14-17, 2005 from (a) AMSR-E remote sensing estimates, (b) the Noah LSM control run, and (c) the SMC-PF run

Table 1
Evaluation statistics for two NRCS gauges (m³ m⁻³)

Table 2
Evaluation statistics for station at Lucky Hill, USDA ARS (m³ m⁻³)
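The three evaluation statistics used in Tables 1 and 2 (mean, RMSE, and bias) can be computed as in the sketch below. It assumes that bias is defined as the mean of simulated minus observed values and that missing samples are marked with `None`; neither convention is stated in the text, and the function name is hypothetical.

```python
import math


def evaluation_stats(simulated, observed):
    """Return mean, bias, and RMSE of a simulated series against observations."""
    # Keep only time steps where both values are present.
    pairs = [(s, o) for s, o in zip(simulated, observed)
             if s is not None and o is not None]
    n = len(pairs)
    if n == 0:
        raise ValueError("no overlapping simulated/observed samples")
    mean_sim = sum(s for s, _ in pairs) / n
    # Bias: mean error, assumed here to be simulated minus observed.
    bias = sum(s - o for s, o in pairs) / n
    # RMSE: square root of the mean squared error.
    rmse = math.sqrt(sum((s - o) ** 2 for s, o in pairs) / n)
    return {"mean": mean_sim, "bias": bias, "rmse": rmse}
```

With this sign convention, a positive bias indicates that the run overestimates soil moisture, as reported for the control run at the Lucky Hill site, and a negative bias indicates underestimation.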