Bayesian recursive parameter estimation for hydrologic models

The uncertainty in a given hydrologic prediction is the compound effect of the parameter, data, and structural uncertainties associated with the underlying model. In general, therefore, the confidence in a hydrologic prediction can be improved by reducing the uncertainty associated with the parameter estimates. However, the classical approach to doing this via model calibration typically requires that considerable amounts of data be collected and assimilated before the model can be used. This limitation becomes immediately apparent when hydrologic predictions must be generated for a previously ungauged watershed that has only recently been instrumented. This paper presents the framework for a Bayesian recursive estimation approach to hydrologic prediction that can be used for simultaneous parameter estimation and prediction in an operational setting. The prediction is described in terms of the probabilities associated with different output values. The uncertainty associated with the parameter estimates is updated (reduced) recursively, resulting in smaller prediction uncertainties as measurement data are successively assimilated. The effectiveness and efficiency of the method are illustrated in the context of two models: a simple unit hydrograph model and the more complex Sacramento soil moisture accounting model, using data from the Leaf River basin in Mississippi.


Introduction and Scope
The science of hydrology is occupied primarily with the processes that make up the hydrologic water cycle and the hydrologic implications of climatic and anthropogenic changes. A variety of hydrologic models have been developed to facilitate this task, i.e., to simulate the water cycle (or a portion of it) for a region of interest. While some of the parameters of these models may represent measurable attributes such as watershed area, others typically represent conceptual (effective) attributes such as mean hydraulic conductivity or rates of drainage for hypothetical lumped water storages. Therefore the numerical values of many of the parameters are not easily inferred from quantities that can be measured and must be specified by calibration. The aim of model calibration procedures is to reduce the uncertainty in the correct choice of the parameter values (parameter uncertainty) while accounting for uncertainties in the values of the measured input-output time series (data uncertainty) and uncertainties in the structural ability of the model to simulate the processes of interest (model uncertainty or structural uncertainty). The collective impact of the parameter, data, and structural uncertainties gives rise to uncertainties in the predictions made using the model (prediction uncertainty).
During the past two decades, much work has been done to understand and improve hydrologic model calibration methods. Sophisticated criteria have been developed to measure the closeness of fit between the predicted and observed outputs [e.g., Sorooshian and Dracup, 1980; Sorooshian, 1981], and effective and efficient global optimization strategies, such as the shuffled complex evolution (SCE-UA) algorithm [Duan et al., 1994], have been developed to search the feasible parameter space.

Bayesian Inference
The fundamental problem with which we are concerned is to predict (and to quantify the uncertainty in our predictions of) measurable hydrologic quantities (outputs) from observed hydrologic data (inputs) using a specified family of mathematical models that simulate actual input-output relations. For example, suppose that we are given the following: (1) a family of mathematical models that simulate the runoff response to rainfall for a specified watershed, perhaps together with some information about which models in the family are most plausible and (2) time series of observed input-output data (e.g., precipitation, potential evapotranspiration, and streamflow) for the specified watershed from some time t = 1 in the past to the present time T. Then a typical problem might be how to predict streamflow at time T + 1 and further into the future. This section is organized as follows. In section 2.1 we formulate problems like the above as problems of Bayesian inference for nonlinear regression. We explain why the Bayesian framework for statistical inference is admirably suited to such problems and assemble the basic mathematical concepts which will be employed. Our notation is intended to facilitate comparison with the comprehensive exposition of Bayesian methodology in the classic treatise by Box and Tiao [1973]. In section 2.2 we show how the Bayesian formulation permits hydrologists to quantify uncertainty about prediction in a natural and meaningful way. This is accomplished without recourse to calibration, which can be conceptually and/or practically problematic for certain applications. However, because hydrologists may encounter applications for which calibration is meaningful, in section 2.3 we explain how to calibrate within the Bayesian framework. We introduce a recursive formulation of the Bayesian framework in section 2.4. Our application of Bayesian methodology to problems in hydrology has certain conceptual similarities to the GLUE procedure of Beven and Binley [1992] but differs from it in certain important respects that we explore in section 2.5.

Formulation
We are interested in mathematical models that predict outputs from inputs. The models are indexed by parameters, which may (or may not) be physically interpretable. Following Box and Tiao [1973], we denote the predictive model by $\eta$, the vector of all its parameters by $\theta$, inputs by $\xi$, and outputs by $y$. Given model parameter values $\theta$ and input values $\xi$, we predict output values

$\hat{y} = \eta(\xi \mid \theta)$. (1)

Given observed output values $y$, we denote the residual errors from our predictions by

$\varepsilon = y - \hat{y}$ (2)

and summarize these relations by writing

$y = \eta(\xi \mid \theta) + \varepsilon$. (3)

This is a standard formulation of nonlinear regression. Next we elaborate on the structure of the above quantities.
Assume that there are $p$ inputs, indexed by $r = 1, \ldots, p$, and $m$ outputs, indexed by $s = 1, \ldots, m$, available for time steps $t = 1, \ldots, T$. Thus $\xi^r_t$ denotes the value of input $r$ measured at time $t$, and $y^s_t$ denotes the value of output $s$ measured at time $t$. Often, we will manipulate entire arrays of these quantities, e.g., writing $\xi$ and $y$ to denote all of the inputs and outputs up to the current time $T$. On this basis we wish to predict the as yet unobserved outputs for the following time steps. For simplicity, we restrict this study to the next time step $T + 1$. Accordingly, the nonlinear regression equation of interest is

$y_{T+1} = \eta(\xi_{T+1} \mid \theta) + \varepsilon_{T+1}$. (5)
Equation (5) contains various sources of potential prediction error: (1) measurement error, in which, for example, it may be that $\eta(\cdot \mid \theta_1)$ is an accurate representation of the physical data-generating process but the devices used to measure $y_{T+1}$ are somewhat inaccurate; (2) parameter identification error, in which it may be that $\eta(\cdot \mid \theta_1)$ is not an accurate representation of the physical data-generating process but that $\eta(\cdot \mid \theta_2)$ is; and (3) model specification error, in which it may be that for no $\theta \in \Theta$ is the model $\eta(\cdot \mid \theta)$ an accurate representation of the physical data-generating process. Thus, regardless of the sophistication of the mathematical model $\eta$, there remains uncertainty about the predictions that we use $\eta$ to make. The purpose of this paper is to explicate a formal methodology for quantifying that uncertainty.
The uncertainties which we have identified so far differ in a fundamental respect. Uncertainty about the measurements can be formalized in a familiar way: The residuals are assumed to be drawn from a suitable probability distribution. In contrast, uncertainty about the parameters (about which we will say much) and uncertainty about the model itself (about which we will say little) express degrees of belief. However, one way to quantify the latter type of uncertainty is to introduce suitable subjective probability distributions. Expressing uncertainty in terms of probabilities is one of the defining features of Bayesian inference. Once this has been done in an appropriate way, inference becomes a matter of manipulating various probability distributions according to the rules of mathematical probability. We begin by introducing a probability distribution on the possible parameter sets and denoting its density function by $p(\theta)$. This is an unconditional distribution in the sense that it does not depend on the currently available values of the inputs $\xi$ or outputs $y$. In the context of Bayesian inference it is usually called the prior distribution because it can be specified before any data are collected. The purpose of the prior distribution is to quantify initial uncertainty about which parameter values should be employed. Usually, the prior distribution expresses a great deal of initial uncertainty. For example, Beven and Binley [1992] suggested restricting attention to a large rectangle of parameter values and imposing a uniform prior distribution on that rectangle. However, the prior distribution can also reflect knowledge about the parameter values based on the analysis of historical data from previous events or from another system (e.g., watershed) having similar characteristics.
All of the other probability distributions with which we will be concerned are conditional distributions. We will denote their density functions by writing expressions of the form $p(\cdot \mid \cdot)$. Arguments on the left of the vertical bar denote the variables of the density; arguments on the right of the vertical bar denote the fixed values on which the density is conditioned. Context is to be inferred by examining which symbols appear in which roles.
Thus, $p(y_{T+1}, y \mid \xi; \theta)$ is the conditional probability density of the outputs $y_1$ to $y_{T+1}$, given the inputs and the model parameters. The functional form of this density must be specified by the hydrologist, usually by selecting a plausible family of distributions for the residual errors in (5). We will have more to say about how this might be done in section 3. Once the prior density and conditional density have been specified, everything else follows automatically. The conditional density of the outputs and the model parameters, given only the inputs, is

$p(y_{T+1}, y; \theta \mid \xi) = p(y_{T+1}, y \mid \xi; \theta)\, p(\theta)$, (6)

and the conditional density of those outputs to be predicted and of the model parameters, given all of the data that have been collected, is

$p(y_{T+1}; \theta \mid \xi, y) = C\, p(y_{T+1}, y; \theta \mid \xi)$, (7)

where $C^{-1} = \int\!\!\int p(y_{T+1}, y; \theta \mid \xi)\, d\theta\, dy_{T+1}$. Therefore (7) gives a posterior density that quantifies all the uncertainty that remains about the outputs to be predicted and the model parameters after the information in the data has been assimilated. The derivation of this density is usually identified as Bayes theorem. What we do next depends on whether we are interested in quantifying the uncertainty in our predictions or in selecting specific values for the parameters of the model (i.e., model calibration). Notice that (7) is a joint posterior density of the outputs to be predicted and of the model parameters to be estimated. If we are interested in prediction, then we require the marginal posterior density of the outputs to be predicted; if we are interested in calibration, then we require the marginal posterior density of the model parameters to be estimated. Of course, one obtains a marginal density from a joint density by integrating the joint density with respect to the variables that are not of interest. Thus, in the Bayesian formulation of the problem, prediction and calibration are accomplished by symmetric operations. We proceed to describe these operations in greater detail.

Prediction
Prediction is the problem of describing a plausible set of values for $y_{T+1}$, the as yet unobserved outputs. To predict, we first compute the marginal posterior density of $y_{T+1}$:

$p(y_{T+1} \mid \xi, y) = \int_{\Theta} p(y_{T+1}; \theta \mid \xi, y)\, d\theta$. (8)
Once this density has been computed, prediction is merely a question of identifying and reporting meaningful summaries of it. For example, one might use one of the following methods.
1. Report a single predicted value of $y_{T+1}$ by stating some measure of central tendency of (8), e.g., its mean, median, or mode.

Calibration

Calibration is the problem of selecting specific values for the model parameters. In the Bayesian framework, calibration amounts to choosing an estimate $a$ that minimizes the expected value of a loss function $L(\theta, a)$ under the marginal posterior density of $\theta$; the familiar squared-error loss $L(\theta, a) = (a - \theta)^2$ yields the posterior mean. Of course, squared-error loss is rarely an appropriate loss function for calibrating hydrologic models. The parameters may lack intrinsic meaning, in which case measuring errors by imposing a metric on the parameter space seems contrived. Furthermore, because the highest posterior density (HPD) region may be disconnected, several sets of parameter values may be plausible even though their mean is not. In view of these considerations, Beven and Binley [1992] persuasively argued that losses should be measured in the output space, e.g., by a loss function of the form

$L(\theta, a) = \int \chi\big(\eta(\xi \mid \theta), \eta(\xi \mid a)\big)\, w(d\xi)$,

where $\chi$ measures the badness of fit between two predicted series of outputs and $w$ is a probability distribution that weights the possible inputs. Furthermore, we note that we could define several loss functions and construct an estimate by multiobjective optimization [Yapo et al., 1998]. In this paper, we assume that all objectives of interest have been collected in a single loss function.
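
When the parameter space is represented by a finite sample of parameter sets, as in the case studies of section 4, the marginalization in (8) reduces to a weighted mixture over the sample. The following minimal sketch (in Python) illustrates how predictive summaries might then be computed; the function and variable names are ours, not the paper's:

    import numpy as np

    # A minimal sketch of prediction via (8) over a discrete parameter sample.
    # predictions[i] holds eta(xi_{T+1} | theta_i); weights[i] holds the
    # posterior probability of theta_i. All names are illustrative.

    def predictive_moments(predictions, weights):
        mean = np.sum(weights * predictions)          # posterior predictive mean
        # Spread of the weighted predictions reflects parameter uncertainty;
        # an error-model variance term would be added when predicting the
        # *measured* (noisy) output rather than the "true" output.
        var = np.sum(weights * (predictions - mean) ** 2)
        return mean, var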

Recursive Inference
Bayesian inference provides a formal mechanism for combining previous information with new information. In the words of Box and Tiao [1973, p. 12], "Bayes' theorem describes, in a fundamental way, the process of learning from experience, and shows how knowledge about the states of nature represented by $\theta$ is continually modified as new data become available." In the present context, Bayes theorem provides a recursive formula for updating information about $\theta$: each time a new observation $y_{T+1}$ becomes available, the current posterior density of $\theta$ is multiplied by the likelihood of the new observation and renormalized,

$p(\theta \mid \xi_{T+1}, y_{T+1}, \xi, y) \propto p(y_{T+1} \mid \xi_{T+1}; \theta)\, p(\theta \mid \xi, y)$. (15)

Notice that the marginal posterior density in (9) plays the same role in (15) that the prior density played in (6). Finally, notice that we can substitute (15) into (8).
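
In a sampled-parameter implementation such as the one used in section 4, the recursive update (15) amounts to reweighting a fixed set of parameter samples each time a new measurement arrives. A minimal sketch, assuming a Gaussian error model; all names are illustrative:

    import numpy as np

    # One recursive Bayesian update of the probabilities attached to sampled
    # parameter sets, per (15). Computation is done in the log domain for
    # numerical stability. All names are illustrative assumptions.

    def bare_update(weights, simulated, observed, sigma):
        # Gaussian log likelihood of the new observation y_{T+1} under each
        # sampled parameter set theta_i (simulated[i] = eta(xi_{T+1} | theta_i)).
        loglik = -0.5 * ((observed - simulated) / sigma) ** 2
        logw = np.log(weights) + loglik
        logw -= logw.max()                # guard against underflow
        new_weights = np.exp(logw)
        return new_weights / new_weights.sum()

Iterating this function over successive time steps assimilates each new measurement; the prior density in (6) supplies the initial weights.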

Bayesian Inference and GLUE
The sequential nature of Bayes theorem follows from elementary properties of joint, marginal, and conditional probability density functions. The Bayesian paradigm, in which densities are manipulated according to the rules of mathematical probability, has many appealing consequences. For example, it guarantees that the order in which n mutually independent previous events occurred does not affect inferences about the current event. However, if the probability density functions in (13) and (15) are replaced by other quantities, then these relations cease to be meaningful updating formulae.
Such a substitution (in a somewhat less general framework) is a distinctive feature of the GLUE procedure of Beven and Binley [1992]. When the conditional density of the observed outputs is regarded as a function of the unknown parameters with the data held fixed, then $l$ is traditionally called the likelihood function of the unknown quantities given the known quantities [Fisher, 1922]. Beven and Binley [1992] replaced the likelihood function $l$ with a "likelihood measure" $L$, which they take to be a measure of how well the model predictions fit the observations [Beven and Binley, 1992, p. 281]: "We use the term likelihood here in a very general sense, as a fuzzy belief, or probabilistic measure of how well the model conforms to the observed behavior of the system, and not in the restricted sense of maximum likelihood theory ...." The GLUE methodology does not respect the rules of Bayesian inference. Formally, GLUE admits any likelihood measure $L$ in place of the likelihood function $l$. Some likelihood measures use a "likelihood shape factor" $N$, which is set by the user to control the peakedness of the likelihood function [Binley and Beven, 1991; Freer et al., 1996; Franks and Beven, 1997; Franks et al., 1998]. In addition, a user-defined "behavioral threshold" $T$ that separates "behavioral" (accurate) from "nonbehavioral" (less accurate) simulations is commonly employed [Beven and Binley, 1992; Freer et al., 1996; Franks and Beven, 1997; Franks et al., 1998]. Both $N$ and $T$ demonstrate the emphasis on subjective, user-defined parameters within the GLUE methodology for evaluating parameter probabilities. The use of $N$ and $T$ also supports the philosophy behind GLUE that there exists no optimal model, owing to parameter insensitivity, parameter interactions, and nonuniqueness of model structures. This is termed "equifinality" by Beven [1993].
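
For concreteness, the sketch below shows a likelihood measure of the kind cited above: an inverse error-variance score raised to a shape factor N, with runs below a behavioral threshold discarded. The exact functional form varies across the GLUE literature, so this particular choice is an assumption:

    import numpy as np

    # A sketch of a GLUE-style "likelihood measure" with shape factor N and
    # a behavioral threshold, in contrast to a formal likelihood function.
    # The inverse error-variance form used here is one common choice in the
    # GLUE literature and is assumed, not taken from this paper.

    def glue_likelihood_measure(observed, simulated, N=1.0, threshold=0.0):
        sigma_e2 = np.var(observed - simulated)   # error variance of this run
        sigma_o2 = np.var(observed)               # variance of the observations
        L = max(0.0, 1.0 - sigma_e2 / sigma_o2) ** N   # Nash-Sutcliffe-like score
        # Runs scoring below the user-defined threshold are "nonbehavioral"
        # and receive zero weight.
        return L if L >= threshold else 0.0
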
Beven and Binley [1992, p. 287] regarded $L$ as "a fuzzy measure that reflects the degree of belief of the modeler." In the Bayesian framework such information is incorporated into the inference via the prior distribution of $\theta$. When the prior distribution is derived from the sample, it is sometimes called an empirical prior distribution. Accordingly, we note the possibility of using $L$ as an empirical prior distribution for $\theta$.

Selection of the Error Model
Implementation of the Bayesian inference scheme requires selection of a functional form for the conditional probability density of the outputs by selecting a plausible family of distributions for the residual errors in (5). In this paper, we consider the case of a single input ($p = 1$) and a single output ($m = 1$). We shall assume that there exists a one-to-one, continuously invertible transformation $z = g(y)$ such that the measurement errors in the transformed output space, given by

$v = g(y) - g(\hat{y})$,

are mutually independent, each distributed according to an exponential power density with kurtosis parameter $\beta$ [Box and Tiao, 1973]. Under these assumptions the posterior density of the parameters is

$p(\theta \mid \xi, y) \propto [M(\theta)]^{-T(1+\beta)/2}\, p(\theta)$, (24)

where $M(\theta) = \sum_{t=1}^{T} |v_t(\theta)|^{2/(1+\beta)}$. Notice that in the case of normally distributed errors ($\beta = 0$), $M$ is simply the familiar "sum of the squared measurement errors" function that is commonly minimized in model calibration. Note also that as $T \to \infty$, this distribution begins to concentrate on the global minimizer of $M(\theta)$, i.e., on $\theta^*$. Thus our calibration procedure is a consistent estimator of $\theta^*$.
Unfortunately, the use of (24) for computing the posterior density of the parameters given the data leads to numerical stability problems as $T \to \infty$; $M(\theta)$ tends to a constant value when evaluated at a given $\theta$, but the term $[M(\theta)]^{-T(1+\beta)/2}$ grows without bound. Further, there is a practical advantage to computing an explicit estimate of the unknown error variance. Therefore we follow Box and Tiao [1973] and estimate this variance explicitly, updating the estimate recursively as measurements are assimilated.
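
Numerically, these quantities are best handled in the log domain. The sketch below evaluates the unnormalized log posterior (24) for a single parameter set, taking the Box-Cox transformation with $\lambda = 0.5$ used in section 4 as the transformation $g$; the function names are illustrative:

    import numpy as np

    # A sketch of the log posterior under the exponential power error model,
    # evaluated in the log domain to avoid the overflow noted above.
    # lambda_ = 0.5 (Box-Cox) and beta = 0 (Gaussian) follow the case studies;
    # everything else here is an illustrative assumption.

    def boxcox(y, lambda_=0.5):
        return (y ** lambda_ - 1.0) / lambda_

    def log_posterior(observed, simulated, log_prior, beta=0.0):
        """Unnormalized log p(theta | data) for one parameter set."""
        v = boxcox(observed) - boxcox(simulated)     # transformed residuals
        T = len(v)
        M = np.sum(np.abs(v) ** (2.0 / (1.0 + beta)))
        # log of [M(theta)]^(-T(1+beta)/2) * p(theta)
        return -0.5 * T * (1.0 + beta) * np.log(M) + log_prior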

Case Studies
We illustrate the power and applicability of the Bayesian recursive estimation (BaRE) scheme by means of two simple case studies. The first is a synthetic study using the two-parameter Nash-cascade model [Nash, 1960]. This illustrates the ability of the BaRE method to locate (i.e., assign the highest probability to) the region of the known true parameter values. The second study explores the utility of BaRE in an operational setting involving the prediction of streamflow for an "uncalibrated" watershed using the SAC-SMA model developed by Burnash et al. [1973]. In both cases, we predict a scalar-valued output (i.e., $m = 1$), assume a Gaussian error model (i.e., $\beta = 0$), and use a uniform distribution on $\Theta$ as the prior density $p(\theta)$.

Study Watershed and Data
The data used in this study come from the Leaf River basin in Mississippi. The mean precipitation for that period is also 1324 mm, but the mean runoff is 33.60 m³/s, more than 10% higher than the 11-year average.

Case Study I: Nash-Cascade Model
In this section, we investigate the general characteristics of the BaRE method by means of a synthetic data study using the two-parameter Nash-cascade model, a very simple unit hydrograph model. The use of synthetic output data (generated using the observed precipitation and an assumed true parameter set) allows an assessment of the prediction and parameter estimation potential of the Bayesian methodology under the controlled conditions of no errors in model structure and known properties of the data measurement error.
The Nash-cascade model is a channel routing scheme that is based on a series of linear reservoirs [Nash, 1960]. Conceptually, the inflow is successively routed through $n$ reservoirs that all have the same recession coefficient $k$. Mathematically, the response function can be written as

$h(t) = \frac{1}{k\,\Gamma(n)} \left(\frac{t}{k}\right)^{n-1} \exp\!\left(-\frac{t}{k}\right), \qquad n > 0, \quad k > 0$,

where $\Gamma(\cdot)$ is the gamma function. This model, driven by the observed precipitation and an assumed true parameter set, was used to generate synthetic "measured" streamflow data. A variety of experiments were performed to assess the effectiveness of the BaRE methodology in terms of its ability to generate accurate and precise streamflow predictions and to locate the region of the true parameter values. We were also concerned with algorithm efficiency, particularly the amount of data required by the algorithm to converge to the known answers. To explore the sensitivity of the algorithm to the way in which the feasible parameter space is sampled, we made several runs using three different sampling strategies. The feasible space was defined by allowing $n$ to vary from 1.0 to 3.0 and $k$ to vary from 2.0 to 48.0 days. The initial guess for the error variance $\sigma^2$ was set to 5.8 (equal to the variance of the streamflow). To reduce sensitivity to initialization, a 65-day warm-up period was used during which no updating of the density functions was performed. After that, the observed data were assumed to be available at each time step. The results obtained for each sampling strategy are discussed below. In the interest of brevity, graphical results are presented only for strategy 2.

4.2.1. Strategy 1. In this experiment the parameters $n$ and $k$ were sampled over the feasible space using a rectangular grid of 21 by 21 points (441 points in total) that contains the true parameter values at the midpoint of the grid. As expected, the BaRE algorithm pinpointed the true parameter values for both the "perfect data" case and the 20% measurement error case. For the perfect data case the algorithm exactly reproduced the true hydrograph, and the prediction uncertainty reduced to zero after only three time steps. For the 20% measurement error case the parameter uncertainty (and hence the associated output prediction uncertainty) became insignificant after ~75 days. In addition, the 95% Bayesian confidence intervals associated with the output measurement reached a constant width that included ~90% of the measured streamflow values, slightly less than the expected value of 95%.

4.2.2. Strategy 2. In this experiment the parameters $n$ and $k$ were sampled over the feasible space using a rectangular grid of 20 by 20 points such that the sampled set does not contain the true parameter values (a more realistic situation). In this case, the BaRE algorithm resulted in a slower but steady convergence of the HPD region into the vicinity of the true parameter values. Figure 1 shows the evolution of the HPD region of the posterior parameter probability density (in the form of one-dimensional projections, with darker shading indicating higher probability) for the case of 20% measurement error. Notice that it took ~100 days to detect the most likely parameter set (a grid point just next to the true parameter values, indicated by an asterisk on the plots) and ~125 days for the algorithm to assign a probability >95% to the same parameter set. Figure 2 shows three snapshots, at time steps of 50, 75, and 100 days, of the posterior joint density of the two model parameters.
Figure 2a shows that by day 50 the BaRE methodology has begun to assign higher probabilities to ~25 points in the region having higher $k$ and lower $n$ than the true values ($k_{true}$ = 24 days, $n_{true}$ = 2.0). By day 75 the algorithm has concentrated on 12 points arranged along an arc, indicating interaction between the parameters, but has selected two possibilities for the most likely points (Figure 2b). By day 100, one of the four grid points surrounding the true parameter values has been assigned the highest probability, although there are still two other possible competitors (Figure 2c). Figure 3 shows how the BaRE algorithm translates the parameter uncertainty into model predictions of the true and measured streamflows. Figure 3a shows the predicted and observed streamflows, and Figure 3b shows the residuals and prediction uncertainties measured in terms of the differences from the true streamflow values. The dark shaded region, representing the 95% Bayesian confidence interval for the prediction of the true streamflows, is associated with the current uncertainty in the parameter estimates. This region is relatively wide for the first 140 days, reflecting the imprecision in the parameter estimates, but collapses to a line when the algorithm settles on a best parameter set. Notice that the results indicate a small tendency toward positive bias, reflecting the fact that the true parameters are not contained in the feasible sample. The light shaded region represents the 95% Bayesian confidence interval for the prediction of the streamflow measurement (i.e., the total prediction uncertainty associated with parameter, structure, and measurement errors). In contrast to the dark shaded region, this confidence interval tends toward a constant width centered on the "most likely" prediction and is found to bracket ~90% of the observed data. In addition, the unreasonably large initial value assigned to $\sigma^2$ is reflected in the exponentially decreasing width of the uncertainty region during the first 30 days. In general, the impact of the choice of $\sigma^2$ decays rapidly and does not substantially affect the overall results.

4.2.3. Strategy 3. In general, the method of regular sampling from a rectangular grid is only useful for models having a small number of parameters (one to four) because the sample size in a grid-sampling scheme grows exponentially with parameter dimension. A more practical approach would involve random sampling of a predetermined number of points from the feasible space in such a way as to uniformly sample all the parameter regions of interest. Therefore, in this experiment the parameters $n$ and $k$ were sampled randomly over the feasible space using a uniform distribution, for a total of 441 points. (Note that the sample did not include the true parameter set.) The number of points was selected to correspond to the numbers of points in the earlier strategies. The performance of this strategy was comparable to the grid-based approach, with the algorithm converging to one of the parameter sets close to the true values after ~120 days.
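
For readers who wish to reproduce an experiment of this kind, the sketch below generates synthetic "measured" streamflow from the Nash-cascade response by convolution with a rainfall series; the rainfall series, noise model, and daily discretization are illustrative assumptions rather than the exact setup used here:

    import numpy as np
    from math import gamma

    # A sketch of the two-parameter Nash-cascade response and its use to
    # generate synthetic streamflow by convolution with a rainfall series.
    # The rainfall statistics and 20% noise level are illustrative only.

    def nash_iuh(t, n, k):
        """Instantaneous unit hydrograph of n linear reservoirs, coefficient k."""
        return (t / k) ** (n - 1) * np.exp(-t / k) / (k * gamma(n))

    def simulate(rainfall, n=2.0, k=24.0, horizon=200):
        t = np.arange(1, horizon + 1)             # days
        h = nash_iuh(t, n, k)
        return np.convolve(rainfall, h)[: len(rainfall)]

    # Synthetic "measured" data: true response plus 20% multiplicative noise.
    rng = np.random.default_rng(0)
    rain = rng.gamma(0.3, 8.0, size=500)          # illustrative rainfall series
    true_flow = simulate(rain)
    measured = true_flow * (1.0 + 0.2 * rng.standard_normal(len(true_flow)))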

Case Study II: Sacramento Model
Figure 3. Probabilistic streamflow predictions made using the Nash-cascade model (strategy 2, 20% measurement error). Solid dots denote the "measured" streamflow, the dark shaded region indicates the 95% confidence intervals for prediction of the "true" streamflow, and the light shaded region indicates the 95% confidence intervals for prediction of the measured streamflow. (a) Streamflow predictions and (b) streamflow uncertainties relative to "true" flows.

The SAC-SMA model (see Figure 4), developed by Burnash et al. [1973], has been used extensively for parameter estimation and streamflow forecasting studies [e.g., Brazil, 1988; Duan et al., 1994; Gupta et al., 1998]. This case study illustrates the usefulness of the BaRE methodology to hydrologists who wish to use the SAC-SMA model for flow forecasting on a watershed that has not yet been calibrated and for which data monitoring has only recently been initiated. Hydrologists must therefore make the best use of the limited amounts of available data, crude parameter estimates (ranges) based on nearby watersheds, and new gauge data as they become available.
In keeping with previous studies [e.g., Brazil, 1988; Yapo et al., 1996], 13 model parameters were assumed to be unknown and to have the uncertainty ranges defined in Table 1. The feasible parameter space defined by these ranges was uniformly sampled at 10,000 randomly selected locations and was assigned a uniform prior probability density function. Further, we assumed that the output errors have a heteroscedastic (nonconstant) variance that is related to flow level and which can be stabilized by the Box-Cox transformation $z = (y^\lambda - 1)/\lambda$ with $\lambda = 0.5$ [see, e.g., Sorooshian and Dracup, 1980]. The model simulations were assumed to begin on July 28, 1953, with measured streamflow data only becoming available on October 1, 1953, at which point operational forecasting and model updating using the BaRE methodology were initiated. Hence the initial 65 days represent a spin-up period to help minimize model initialization errors. From October 1 onward, BaRE was used to generate one-day-ahead streamflow predictions using the current parameter probability distribution and to update the parameter probability distribution daily as soon as each new streamflow measurement became available. The narrow width of the dark shaded region, representing the 95% Bayesian confidence interval associated with the parameter uncertainty, suggests that too much confidence may have been assigned to the model, a point we will return to in section 5. In contrast, the width of the light shaded region, representing the 95% Bayesian confidence interval for the prediction of the streamflow measurement (in transformed output space; see Figure 5c), decreases to a relatively small value during the first 100 relatively dry days, thereafter doubling over the next 3 months of significant rainfall activity. This variation in estimated prediction uncertainty suggests that a better choice for the error model may be possible. The evolution of the parameter estimates is illustrated in Figures 6a-6d for four of the model parameters: upper zone free water capacity (uzfwm), upper zone free water withdrawal rate (uzk), lower zone primary free water capacity (lzfpm), and lower zone primary free water withdrawal rate (lzpk). This plot shows a projection onto each parameter axis of the parameter values for the points belonging to the 99.99% HPD region at each time step. In the beginning, owing to the uniform prior probability, all the points appear on the plot (see the first few time steps). As the streamflow data are processed, the number of points belonging to the updated HPD region decreases rapidly, leaving only a few competing parameter sets. However, the competing parameter sets appear to be scattered at different locations within the feasible parameter space, indicating that different parameterizations of the SAC-SMA model can give rise to a similar ability to predict the outputs for the limited amount of data processed so far. In particular, the HPD region for the lower soil zone recession constant lzpk first tends toward the lower portion of the feasible space but later shifts toward a much higher value.
Also marked in Figure 6 is the location of the most probable (maximum likelihood) parameter set at each time step (dashed line). In the beginning, the most probable value jumps back and forth among the competing parameter sets but finally settles down to a best value at about day 200. It is interesting to note that the major jumps coincide with some of the large storm events. The similarity of this jumping phenomenon to that observed in strategy 3 of the synthetic case study suggests that a possible cause is insufficient density of sampling in the region of good parameter values. However, one should also consider the possibility that the assumed models (i.e., hydrologic model, error model, and transformation model) do not adequately represent the observed input-output process. Finally, Table 1 shows a comparison of the 11-year forecast performance and final parameter estimates obtained using BaRE in on-line mode with two conventional off-line batch calibration runs, one reported by Brazil [1988] and the other performed by us. This comparison is possible because all three studies involve calibration of the SAC-SMA model to the Leaf River basin using the 11-year period WY 1953-1963. Brazil [1988] used a three-stage interactive multilevel calibration procedure that combined manual and automatic methods, while we used the SCE-UA global optimization procedure developed by Duan et al. [1994]. Note that while the batch methods have the advantage of being able to employ sophisticated search procedures to actively refine the location of the parameter estimates, the current version of the BaRE procedure is limited to selecting from a fixed set of randomly specified points distributed rather coarsely throughout the feasible space. In spite of this, the BaRE method provided forecasts with an overall RMSE prediction error only slightly larger than that given by the Brazil [1988] results. Furthermore, the parameter differences are not very large (compared to the initial uncertainty ranges), with the exception of uzk, the maximum percolation rate coefficient (zperc, dimensionless), and the percolation equation exponent (rexp, dimensionless), which are known to be generally less sensitive. It is interesting to note that the BaRE method converged to its final parameter estimates after processing only ~1.5 years of data. This is impressive, considering that at least 11 years of data are generally considered necessary for obtaining reliable parameter estimates via conventional batch calibration.
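
The HPD projections of Figure 6 can be obtained from the weighted parameter samples by a simple sort-and-accumulate rule: retain the most probable samples until their probabilities sum to the desired level. A sketch under that assumption (the paper does not spell out its exact procedure):

    import numpy as np

    # A sketch of extracting a discrete HPD region: the smallest set of
    # sampled parameter sets whose probabilities sum to the desired level
    # (99.99% in Figure 6). The sort-and-accumulate rule is the natural
    # discrete analogue and is assumed here.

    def hpd_indices(weights, level=0.9999):
        order = np.argsort(weights)[::-1]         # most probable samples first
        cum = np.cumsum(weights[order])
        n_keep = np.searchsorted(cum, level) + 1  # first index reaching the level
        return order[:n_keep]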

Summary and Discussion
This paper has discussed a Bayesian probability-based approach to simultaneous parameter estimation and output uncertainty prediction via on-line recursive processing of time series data as they become available. A practical algorithm, called the Bayesian Recursive Estimation (BaRE) method, has been developed and tested. BaRE differs from conventional calibration and prediction methods in that it employs a recursive scheme for tracking the conditional probabilities associated with several competing parameter sets (models) in an on-line mode instead of searching for a single best solution in an off-line mode. The parameter probabilities are used to compute probabilistic predictions of the desired output variables. Probability updating, via Bayes theorem, facilitates the assimilation of new data as they become available.
The most obvious practical advantage of the BaRE approach is that it enables hydrologists to generate consistent model predictions, along with estimates of the uncertainty in those predictions, even if historical hydrologic data are not available for model calibration using conventional methods. While the initial predictions (and parameter estimates) may be crude, their quality can be refined recursively with time as data become available. The approach is therefore directly applicable when a watershed has only recently been gauged. However, this approach could also be extremely useful where the time and expertise required for conventional calibration are not available. In such situations the parameter estimates provided by BaRE could help to provide guidance for later model calibration efforts. Furthermore, we believe that the BaRE procedure can be a very efficient and valuable tool for the refinement of parameters defined through regionalization, i.e., the transfer of parameters from a calibrated watershed to another watershed with similar hydrologic characteristics. In this case, the regionalized parameter values can be used to determine the parameter ranges and prior distribution. This distribution can then be altered successively by BaRE to reflect the subtle differences between the calibrated and the regionalized watersheds.
The two hydrologic case studies presented in this paper serve to illustrate the effectiveness and efficiency of the BaRE methodology, as well as some weaknesses of the current implementation. The first (synthetic data) study used a simple unit hydrograph model to show that the BaRE algorithm does indeed provide consistent one-step-ahead probabilistic predictions of the true and measured streamflow values. At the same time, the algorithm is able to quickly and precisely locate the region of the known true parameter values, even when the data contain significant amounts of noise. The second (real data) case study demonstrated the applicability of BaRE in the context of the SAC-SMA model used by the U.S. National Weather Service for operational streamflow forecasting. Although the SAC-SMA model cannot be considered to be a perfect representation of the watershed processes of the Leaf River basin, the BaRE method produced good one-step-ahead flow predictions with reasonable uncertainty estimates, with an accuracy that compares well with traditional batch calibration methods. In both cases the algorithm was relatively insensitive to the initial estimate $\sigma^2$ for the variance of the errors (for practical applications, we suggest setting $\sigma^2$ at ~25-50% of the estimated long-term variance of the streamflows). Furthermore, it was somewhat surprising that BaRE converged to its final most likely parameter estimates after processing only a relatively small amount of data. The consistency of this finding needs to be evaluated through further studies.
Our experiences with the current version of the BaRE algorithm suggest that there are several ways in which the implementation can be improved. The first is related to the phenomenon where the current most likely parameter set jumps large distances in the feasible parameter space. While this may be a natural consequence of parameter insensitivity, interaction, or nonuniqueness of the model structure, we suspect that a probable cause is a combination of uneven and insufficient density of sampling in the region of good parameter values. It is interesting to note that after all the available data have been processed, the final most likely BaRE parameter estimate is theoretically equivalent to the value that would have been obtained by conventional batch calibration using pure random search. However, state-of-the-art batch calibration methods iteratively refine the search to focus on the most promising region of the parameter space. Thus we are exploring ways to implement progressive resampling in order to concentrate the samples in the current HPD region while terminating computations in the nonproductive portions of the parameter space. At the same time, special procedures such as Latin hypercube sampling [e.g., Ye, 1998] can be implemented to control sample evenness.
The second area of improvement is related to the selection of the output transformation and error models. The present algorithm assumes that the correct values for $\beta$ and $\lambda$ are known and that the error variance in the transformed output space is constant. However, in the case of the SAC-SMA study the 95% Bayesian confidence interval for the prediction of the streamflow measurement varied substantially, suggesting the need to investigate better choices for the transformation and error model parameters. We recognize, therefore, the need for objective procedures to guide the selection of the appropriate transformation and error models.
Finally, the SAC-SMA case study also revealed that the model predictions tended to be unbiased during the early portion of the run when the parameter HPD region had not yet collapsed to a single point. However, during the later period, when the algorithm had decided on a single best parameter set, the residuals showed a tendency toward systematic positive bias. We speculate that the performance of the BaRE methodology might be improved by preventing the parameter HPD region from collapsing to a single point. We believe that the algorithm might overvalue the information in the most recent data, and we are currently testing different ways in which this could be prevented, such as the incorporation of a "forgetting factor" into the parameter probability updating rule. Such a strategy may also have the additional benefit of allowing the algorithm to track slowly varying changes in the hydrologic behavior of the watershed (e.g., due to deforestation). Our findings related to these potential algorithmic improvements will be reported in future papers.
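
One speculative form such a forgetting factor could take, sketched below, is to flatten the current parameter probabilities before each update so that the influence of older data decays geometrically; the paper leaves the exact formulation to future work:

    import numpy as np

    # A speculative sketch of a "forgetting factor" in the probability update:
    # raising the current weights to a power 0 < alpha <= 1 before multiplying
    # by the new likelihood flattens the distribution, discounting old data and
    # keeping the HPD region from collapsing to a single point. The exact form
    # used by the authors is not specified in this paper.

    def update_with_forgetting(weights, likelihood, alpha=0.98):
        w = weights ** alpha * likelihood
        return w / w.sum()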