Hydrologic evaluation of satellite precipitation products over a mid-size basin

Over the past three decades, a great deal of effort has been devoted to the development of satellite-based precipitation retrieval algorithms. More recently, several satellite-based precipitation products have emerged that provide uninterrupted precipitation time series with quasi-global coverage. These products provide an unprecedented opportunity for hydrometeorological applications and climate studies. Although growing, the use of satellite data in hydrological applications is still very limited. In this study, the effectiveness of using satellite-based precipitation products for streamflow simulation at catchment scale is evaluated. Five satellite-based precipitation products (TMPA-RT, TMPA-V6, CMORPH, PERSIANN, and PERSIANN-adj) are used as forcing data for streamflow simulations at 6-h and monthly time scales during the period 2003–2008. The SACramento Soil Moisture Accounting (SAC-SMA) model is used for streamflow simulation over the mid-size Illinois River basin. The results show that, with satellite-based precipitation forcing, the general streamflow pattern is well captured at both 6-h and monthly time scales. However, satellite products with no bias-adjustment significantly overestimate both precipitation inputs and simulated streamflows over warm months (spring and summer). For the cold season, on the other hand, the unadjusted precipitation products result in under-estimation of the streamflow forecast. Bias-adjustment of precipitation was found to be critical and can yield substantial improvement in capturing both streamflow pattern and magnitude. The results suggest that, along with efforts to improve satellite-based precipitation estimation techniques, it is important to develop more effective near real-time precipitation bias adjustment techniques for hydrologic applications.


Introduction
Precipitation is the key input for hydrometeorological modeling and applications. For accurate flood predictions, reliable quantification of precipitation is crucial. However, in many populated regions of the world, including developing countries, ground-based measurement networks (whether radar or rain gauge) are either sparse in both time and space or nonexistent. This situation limits the ability of these regions to manage water resources and hampers early flood warning systems, resulting in massive socioeconomic damage.
With suites of sensors flying on a variety of satellites over the last three decades, many satellite-based precipitation estimation algorithms have been developed to make precipitation data available to the community at quasi-global scale. Several high resolution precipitation products are now operational at quasi-global scale. Among those are the TRMM Multisatellite Precipitation Analysis (TMPA; Huffman et al., 2007), the Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks (PERSIANN; Hsu et al., 1997; Sorooshian et al., 2000), the Climate Prediction Center (CPC) morphing algorithm (CMORPH; Joyce et al., 2004), and the Naval Research Laboratory Global Blended-Statistical Precipitation Analysis (NRLgeo; Turk et al., 2000). Although they differ in the precipitation estimation procedure, all of the listed products use a combination of information from infrared and microwave sensors on geostationary and low earth orbiting satellites in an attempt to improve the consistency, accuracy, coverage, and timeliness of high resolution precipitation data.
Given the different estimation techniques and the existing uncertainties in retrieving precipitation characteristics from satellite information (Adler et al., 2001; Ebert et al., 2007; Gottschalck et al., 2005; AghaKouchak et al., 2009; McCollum et al., 2002; Tian et al., 2007), studies on the reliability of hydrologic predictions based on satellite-derived precipitation data need to be continued. One useful outcome of such studies is an assessment of the applicability of satellite-based streamflow prediction for data sparse regions. These types of studies are also motivated by the global decline of in situ networks for hydrologic measurements (Stokstad, 1999; Shiklomanov et al., 2002), as opposed to the growing availability of satellite sensors providing more frequent and more accurate precipitation-relevant information, and by near future missions such as the Global Precipitation Measurement (GPM) mission, among others. In concert with such developments, a great deal of research is being conducted to improve the quality and resolution of precipitation products from individual sensors or combinations of sensors (e.g., Behrangi et al., 2010b, among others).
Several previous studies estimated streamflow by using hydrologic models with inputs obtained from remotely sensed data (Hong et al., 2006; Hossain and Anagnostou, 2004; Yilmaz et al., 2005). Schultz (1996) proposed a model to reconstruct monthly runoff estimates based on geostationary satellite data, and applied a hydrologic model to obtain flood hydrographs. Tsintikidis et al. (1999) evaluated the feasibility of satellite-derived mean areal precipitation estimates for hydrologic application across northern Africa. Using Meteosat-inferred precipitation data, Grimes and Diop (2003) predicted streamflow and concluded that inclusion of numerical weather model outputs might improve the estimated flood hydrographs. Nijssen and Lettenmaier (2004) investigated the effect of satellite-based precipitation sampling error on estimated hydrological fluxes. Using TMPA data, Su et al. (2008) investigated the feasibility of satellite-based precipitation data for hydrologic predictions. They concluded that satellite estimates have potential for hydrologic forecasting, particularly with respect to simulation of seasonal and inter-annual streamflow variability.
This study aims to assess the use of available near real-time operational precipitation estimation products in streamflow forecasting. The objective of the present study is threefold. First, it examines how precipitation estimates from satellite data using different algorithms compare with the ground multi-sensor product over a mid-size basin. Second, assuming that the hydrologic model generates reliable streamflow estimates, it examines how differences in input precipitation characteristics among products are reflected in the resulting streamflow hydrographs at the time scale (usually 6 h) used by the National Weather Service (NWS). The results provide insight into the accuracy needed for precipitation input. Finally, evaluation of precipitation inputs with respect to ground-based streamflow observations at the watershed outlet can provide a secondary check, particularly for hydrologic applications.
The paper consists of 5 sections. In Section 2, case study specifications including period and area of study, description of hydrologic model, and datasets are provided. Method and model calibration are described in Section 3. Section 4 outlines the results and discussion of findings. Finally, concluding remarks are presented in Section 5.

Period and area of study
The experiment is performed using 6 years of data (2003–2008) over the Illinois River basin located upstream of the USGS gauging station (07195430) south of Siloam Springs, Arkansas (Fig. 1). The watershed (hereafter referred to as the Siloam basin) has been utilized as a test basin for the Distributed Modeling Inter-comparison Project (DMIP). The Siloam basin occupies 1489 km², which is typical of the size used as an operational forecasting unit by NWS (Smith et al., 2004). Elevation ranges from 285 m at the outlet to 590 m at the highest point, and the basin's land cover can be described as uniform, with approximately 90% of the basin area covered by deciduous broadleaf forest and the remainder being mostly woody. The dominant soil types in the basin are silty clay (SIC), silty clay loam (SICL), and silty loam (SIL). The average annual rainfall and runoff of the basin are about 1200 and 300 mm/year, respectively (Smith et al., 2004). The Siloam basin is free of major complications such as orographic influences, significant snow accumulation, and stream regulation (Smith et al., 2004).

Description of the hydrologic model
The SACramento Soil Moisture Accounting (SAC-SMA) model (Burnash et al., 1973; Burnash, 1995) is used to model the rainfall-runoff process. SAC-SMA is a lumped, conceptual model and is used as the core component of the National Weather Service (NWS) River Forecasting System (NWSRFS) for rainfall-runoff modeling at the basin scale. Mean Areal Precipitation (MAP) and potential evapotranspiration are the forcing data from which the model generates runoff response components. The model consists of an upper zone representing the uppermost soil layer and a lower zone representing the deeper portion of the soil profile. Each zone includes tension and free water storages. The percolation rate from the upper to the lower layer is controlled through a non-linear process that depends on the status of the upper-zone free water and the deficiencies in the lower-zone storages. The model has sixteen parameters and six soil moisture states and generates five runoff response components as follows: (1) direct runoff resulting from precipitation falling on permanent and temporary impervious areas, (2) surface runoff generated when the precipitation rate is greater than the percolation rate, (3) interflow, which is the lateral outflow from the upper-zone free water storage, (4) supplementary base flow, which is the lateral drainage from the lower-zone supplementary free water storage, and (5) primary base flow, which is the lateral drainage from the lower-zone primary free water storage. The sum of the runoff components is then convolved with the unit hydrograph of the basin's outlet to generate the streamflow at this location.

Datasets
The dataset used in this study consists of precipitation forcing from five satellite-based products along with the reference ground multi-sensor precipitation data, potential evaporation, and streamflow observations at the basin's outlet. The satellite-derived precipitation products utilized in the present study are: (1) TMPA real-time (hereafter referred to as TMPA-RT), which collects available microwave-derived precipitation estimates from various satellites within a time bracket of 3 h for each cell on a 0.25° × 0.25° grid and then fills the gaps with microwave-calibrated infrared estimates, (2) PERSIANN, which uses artificial neural networks to establish relationships between infrared data and rain rate after real-time adjustment of network weights based on available microwave-derived rain rates, (3) CMORPH, which estimates a temporally and spatially complete precipitation field exclusively from microwave observations, through guided propagation of precipitation estimates between two microwave images using infrared-based cloud tracking, (4) TMPA bias adjusted (hereafter referred to as TMPA-V6), and (5) PERSIANN bias adjusted (hereafter referred to as PERSIANN-adj).
As discussed by Huffman et al. (2007), from 1 January 1998 to the end of March 2005 the TMPA-V6 utilizes the Global Precipitation Climatology Center (GPCC) 1.0° × 1.0° monthly monitoring product, and since then uses the Climate Assessment and Monitoring System (CAMS) 0.5° × 0.5° monthly gauge analysis, to bias adjust the 3-h reprocessed and initially processed TMPA estimates, respectively. Note that besides precipitation bias, TMPA-V6 differs from TMPA-RT in that, in TMPA-V6, the TRMM Combined Instrument (TCI) precipitation estimate is used to calibrate rain estimates from other microwave sensors, while the TRMM Microwave Imager (TMI) precipitation estimate is the calibrator in the real-time product (G. Huffman and D. Bolvin, 2010, personal communications). PERSIANN-adj is obtained by computing a correction factor (a) as the ratio of GPCP rainfall to PERSIANN rainfall on 2.5° grids at monthly scale. The monthly bias is then spatially downscaled and removed from the PERSIANN 0.25° resolution estimates using the correction factor a. GPCP monthly rainfall inherently incorporates gauge measurements and several satellite-based rainfall and model estimates (Adler et al., 2003). PERSIANN-adj maintains the total monthly precipitation estimate of GPCP while retaining the spatial and temporal detail made available through the PERSIANN estimate (0.25° and hourly). The hourly 0.25° lat/long PERSIANN-adj data, together with the listed satellite and multi-sensor precipitation products, are integrated from their original resolution onto a common 0.25° × 0.25° grid at 6-h and monthly resolution for use at the study time scales.
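The ratio-based adjustment described for PERSIANN-adj can be illustrated with a minimal sketch. It assumes that the fine grid nests exactly inside the coarse grid and that the coarse correction factor is downscaled by simple replication onto the fine cells; the text does not specify the downscaling scheme, so this choice, the function name `bias_adjust_month`, and all argument names are hypothetical.

```python
import numpy as np

def bias_adjust_month(fine_hourly, ref_monthly_coarse, block):
    """Scale one month of fine-grid hourly rain so that its coarse-grid
    monthly total matches a reference monthly total.

    fine_hourly        : array (T, ny, nx), hourly rain in mm/h
    ref_monthly_coarse : array (ny//block, nx//block), reference total in mm
    block              : fine cells per coarse cell side (10 for 0.25 -> 2.5 deg)
    """
    fine_hourly = np.asarray(fine_hourly, dtype=float)
    ref = np.asarray(ref_monthly_coarse, dtype=float)
    _, ny, nx = fine_hourly.shape

    # Monthly total on the fine grid (mm), then averaged onto the coarse grid.
    fine_total = fine_hourly.sum(axis=0)
    coarse_total = fine_total.reshape(ny // block, block,
                                      nx // block, block).mean(axis=(1, 3))

    # Correction factor = reference / estimate; left at 1 where the estimate is zero.
    factor = np.divide(ref, coarse_total,
                       out=np.ones_like(coarse_total),
                       where=coarse_total > 0)

    # Assumed downscaling: replicate each coarse factor over its fine cells.
    factor_fine = np.kron(factor, np.ones((block, block)))
    return fine_hourly * factor_fine

# Example: one coarse cell made of 2 x 2 fine cells, a single hourly field.
est = np.array([[[1.0, 3.0], [0.0, 4.0]]])
adjusted = bias_adjust_month(est, np.array([[4.0]]), block=2)
```

Because a single multiplicative factor is applied per coarse cell, the relative spatial and temporal pattern of the fine-scale estimate is retained while the coarse monthly total is matched to the reference.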
The reference precipitation estimates are obtained from the standard NWS Multi-Sensor Precipitation Estimates (MPE; NEXRAD and gauge) data. The dataset was made available to DMIP 2 participants in the Hydrologic Rainfall Analysis Project (HRAP) grid format at 1-h temporal and 4 km × 4 km spatial resolution. The Siloam basin is well inside two radar umbrellas, and several past studies have analyzed the quality of the NEXRAD precipitation estimates in this basin and surrounding areas (Smith et al., 2004). Note that for the period of the study, the Siloam basin lacks a continuously available dense network of rain gauges; the combined NEXRAD-gauge data were therefore used as the sole precipitation reference, as they may provide the best possible approximation of the true areal average rainfall.
Hourly streamflow observation data at the basin's outlet were obtained from the USGS local office. Some quality control of the provisional hourly data obtained from the USGS was performed at the NWS Office of Hydrologic Development (OHD). Quality control was a manual and subjective process accomplished through visual inspection of observed hydrographs. The suspicious portions of the hydrograph were simply set to missing (Smith et al., 2004). The reference hourly USGS streamflow observation and hourly average multi-sensor precipitation rates are converted to 6-h and monthly time scales to be used for calibration and evaluation of the results.
Climatic monthly mean values (in mm/day) of potential evaporation (PE) demand were also obtained through DMIP 2. As stated by Smith et al. (2004), these values are derived using information from seasonal and annual free water surface (FWS) evaporation maps in NOAA Technical Report 33 and mean monthly station data from NOAA Technical Report 34.

Calibration of the hydrologic model
In order to generate a more reliable streamflow forecast, the parameters of the SAC-SMA model need to be calibrated. In this study, the calibration procedure is performed separately for each individual satellite product and for the multi-sensor data using the wetter half (2006–2008) of the available dataset (2003–2008), with the 2006 dataset repeated for the spin-up period. The wetter half is selected for calibration based on our expectation that this period may activate a greater number of the SAC-SMA parameters. The remaining dataset (2003–2005) was used for verification of the results. Excess rainfall calculated from the SAC-SMA model is convolved with the 6-h unit hydrograph to generate 6-h streamflow comparable to the 6-hourly accumulated streamflow observation at the basin's outlet. Note that the 6-h unit hydrograph is constructed from the 1-h unit hydrograph provided by DMIP 2 using the S-curve method (McCuen, 2004).
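As an illustration of the two routing steps mentioned above, the following minimal sketch derives a longer-duration unit hydrograph from a 1-h unit hydrograph with the S-curve method and routes an excess-rainfall series by discrete convolution. The function names and the toy ordinates are hypothetical, the output of `s_curve_uh` is still sampled hourly, and how the ordinates are resampled to the 6-h model step is not addressed here.

```python
import numpy as np

def s_curve_uh(uh_1h, duration=6):
    """Unit hydrograph for a `duration`-hour unit rainfall, built from
    hourly ordinates of a 1-h unit hydrograph via the S-curve method."""
    uh_1h = np.asarray(uh_1h, dtype=float)
    padded = np.concatenate([uh_1h, np.zeros(duration)])
    # S-curve: response to continuous rainfall at one unit per hour.
    s = np.cumsum(padded)
    # The same S-curve lagged by `duration` hours.
    s_lag = np.concatenate([np.zeros(duration), s[:-duration]])
    # The difference is the response to `duration` units falling in `duration`
    # hours; dividing by `duration` rescales it to one unit of excess rainfall.
    return (s - s_lag) / duration

def route(excess, uh):
    """Convolve excess rainfall with unit hydrograph ordinates that share the
    same time step, truncated to the length of the rainfall series."""
    excess = np.asarray(excess, dtype=float)
    return np.convolve(excess, uh)[: len(excess)]

uh2 = s_curve_uh([0.5, 0.5], duration=2)          # toy 1-h ordinates
flow = route([1.0, 1.0, 0.0, 0.0], [0.2, 0.5, 0.3])
```

The S-curve construction conserves volume, so the derived ordinates sum to the same value as the original 1-h ordinates.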
In its lumped implementation, SAC-SMA has 13 major parameters that cannot be measured directly and need to be identified through a proper calibration (parameter optimization) scheme. The Shuffled Complex Evolution-University of Arizona (SCE-UA; Duan et al., 1992) algorithm, in conjunction with the Multi-step Automatic Calibration Scheme (MACS; Hogue et al., 2000), is used to calibrate the model parameters. SCE-UA is a robust and efficient optimization algorithm for calibration of complex conceptual hydrologic models (Duan et al., 1992; Cooper et al., 1997; Kuczera, 1997; Thyer et al., 1999). SCE-UA utilizes the simplex method of Nelder and Mead (1965), a random search procedure, and complex shuffling (Duan et al., 1992) in order to reach the global optimum. The MACS procedure applies several objective functions in sequence, alleviating some of the shortcomings associated with using a single objective function (Hogue et al., 2000; Gupta et al., 1998). In brief, MACS consists of the following sequential steps: (1) calibrate all parameters of the SAC-SMA model using the LOG objective function (Eq. (1)) to put more emphasis on estimation of the lower-zone parameters, (2) optimize the SAC-SMA upper-zone and percolation parameters using the root-mean-square error (RMSE) objective function (Eq. (2)) to improve the simulation of the peak flows, and (3) hold the upper-zone parameters fixed and optimize the lower-zone parameters using the LOG objective function. The LOG and RMSE objective functions are defined in terms of Q_sim,t and Q_obs,t, the simulated and observed streamflows at time step t, and n, the total number of streamflow pairs allocated to model calibration.
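Written out with the symbols defined above, the two objective functions can be expressed as follows. The RMSE expression is the standard one. The LOG expression, taken here as the root-mean-square difference of log-transformed flows, is an assumption based on how this measure is commonly used with MACS and should be checked against Hogue et al. (2000).

```latex
\begin{align}
\mathrm{LOG}  &= \sqrt{\frac{1}{n}\sum_{t=1}^{n}
                 \left(\log Q_{\mathrm{sim},t} - \log Q_{\mathrm{obs},t}\right)^{2}} \tag{1}\\
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{t=1}^{n}
                 \left(Q_{\mathrm{sim},t} - Q_{\mathrm{obs},t}\right)^{2}} \tag{2}
\end{align}
```

Under this form, the logarithm compresses large flows, so Eq. (1) weights errors in low flows and recessions more heavily, while Eq. (2) is dominated by errors in peak flows. This matches the roles the two functions play in the three MACS steps.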

Evaluation statistics
In order to analyze the performance of the satellite-based precipitation products for streamflow forecasting, it is important to also evaluate the skill of the individual satellite precipitation products with respect to the reference precipitation data. Therefore, the evaluations are performed for both the precipitation inputs and the corresponding streamflow simulations, and the outcomes are cross-compared. In the present study, the precipitation/streamflow evaluations are conducted at both 6-h and monthly time scales through visual inspection of rainfall-runoff quantities along with statistical measures. The two time scales make it possible to assess the non-linear rainfall-runoff process as well as to investigate the dependence of the statistical measures on seasonality and on long term characteristics of the precipitation and streamflow regimes. The statistical measures used in this study are defined in Appendix A and include the correlation coefficient (COR), root-mean-square error (RMSE), and percent bias (BIAS).
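The three continuous measures can be computed along the following lines. This is a minimal sketch: Appendix A is not reproduced here, so the exact definitions are assumptions (Pearson correlation for COR, and BIAS as the total difference expressed as a percentage of the reference total), and the function name `continuous_stats` is hypothetical.

```python
import numpy as np

def continuous_stats(est, ref):
    """Return (COR, RMSE, BIAS in percent) of an estimate against a reference,
    using only the time steps where both series are available."""
    est = np.asarray(est, dtype=float)
    ref = np.asarray(ref, dtype=float)
    valid = ~(np.isnan(est) | np.isnan(ref))   # skip missing observations
    est, ref = est[valid], ref[valid]

    cor = np.corrcoef(est, ref)[0, 1]                    # Pearson correlation
    rmse = np.sqrt(np.mean((est - ref) ** 2))            # same units as the input
    bias = 100.0 * (est.sum() - ref.sum()) / ref.sum()   # assumed percent-bias form
    return cor, rmse, bias

# Toy 6-h series in which the estimate is exactly twice the reference.
cor, rmse, bias = continuous_stats([2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0])
```

In the toy series the correlation is perfect although the estimate carries a 100% bias, which is why COR and BIAS are reported together.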
For a more detailed evaluation of the precipitation products and generated streamflows, four additional statistical measures are calculated from the contingency table (see categorical statistics in Appendix A): probability of detection (POD), false alarm ratio (FAR), areal bias (BIASa), and equitable threat score (ETS). The construction of the contingency table is based on identifying binary (0/1 or Yes/No) values of precipitation/streamflow occurrence. This is accomplished by selecting a threshold above which a rain event (for example) would be considered to have occurred. By using a range of thresholds, the statistical measures derived from the contingency table yield information on the product's ability to capture precipitation/streamflow occurrences at different rates. POD and FAR range from 0 to 1, with perfection represented by a POD of 1 and a FAR of 0. POD is sensitive to the number of pixels correctly classified as precipitation (Hits). FAR, on the other hand, is sensitive to the number of pixels incorrectly classified as precipitation (False alarms). As a result, a low POD can be increased by increasing the predicted rain coverage, but such improvement would come at the cost of increasing false alarms. A value of 1 for BIASa indicates that predictions and observations have identical area coverage, independent of location. The ETS ranges between −1/3 and 1, with perfection represented by an ETS of 1. It allows the scores to be compared "equitably" across different regimes (Schaefer, 1990) and is insensitive to systematic over- or under-estimation.
Fig. 2 shows the 6-h precipitation time series over the study period (2003–2008) for the reference multi-sensor precipitation (Fig. 2a) and the other precipitation products (Fig. 2b–f). Visual inspection of precipitation rates and patterns, in conjunction with the quantitative statistics reported at the top-right corner of each panel, demonstrates high agreement between the satellite products and the reference multi-sensor data.
The satellite products with no bias-adjustment agree well among themselves, as quantified by COR, RMSE, and BIAS ranging between 0.66–0.79, 0.51–0.61 mm/h, and 27.6–40.1%, respectively. The two monthly bias-adjusted satellite products (Fig. 2e and f) are very similar and show substantial improvement compared to their near real-time counterparts. Fig. 2 also suggests that the satellite products with no bias-adjustment show a strong tendency to overestimate intense precipitation events. As expected, after bias adjustment the over- and under-estimations are significantly reduced, with overall statistics demonstrating negligible BIAS for TMPA-V6 (1.7%) and PERSIANN-adj (6.2%).

Evaluation of precipitation forcing
While Fig. 2 provides information about precipitation intensity and its distribution throughout the period of study, it does not clearly demonstrate the ability of the products to capture the occurrence of precipitation events. Capturing the occurrence of precipitation is presumably important because even a small amount of precipitation can affect the initial soil moisture conditions, which subsequently affect the model's streamflow generation. Fig. 3 shows precipitation occurrence at a range of precipitation intensity thresholds. For example, if a threshold of 1 mm/6 h is selected to detect precipitation events, the ETS, POD, FAR, and BIASa scores can be calculated from the contingency table (see Appendix A) constructed for the 1 mm/6 h threshold. Similarly, by selecting other precipitation thresholds (e.g., 2, 5, 10, 15, 20, and 30 mm/6 h), the skill of the satellite products in capturing low, medium, and intense precipitation events can be analyzed.
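The threshold-based scores can be sketched as follows. The formulas are the standard contingency-table definitions of POD, FAR, frequency (areal) bias, and ETS, assumed here to match Appendix A; the function name `categorical_stats` and the toy series are hypothetical.

```python
import numpy as np

def categorical_stats(est, ref, threshold):
    """POD, FAR, BIASa and ETS for events defined as values above `threshold`."""
    est_yes = np.asarray(est, dtype=float) > threshold
    ref_yes = np.asarray(ref, dtype=float) > threshold

    hits = float(np.sum(est_yes & ref_yes))
    misses = float(np.sum(~est_yes & ref_yes))
    false_alarms = float(np.sum(est_yes & ~ref_yes))
    total = float(est_yes.size)

    def ratio(num, den):
        # Undefined scores (zero denominator) are returned as NaN.
        return num / den if den > 0 else float("nan")

    # Hits expected by chance, used by the equitable threat score.
    hits_random = (hits + misses) * (hits + false_alarms) / total
    return {
        "POD": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "BIASa": ratio(hits + false_alarms, hits + misses),
        "ETS": ratio(hits - hits_random,
                     hits + misses + false_alarms - hits_random),
    }

# Toy 6-h accumulations (mm/6 h) evaluated at several thresholds.
ref = [0.0, 2.0, 5.0, 0.0, 3.0, 0.0]
est = [0.0, 3.0, 0.0, 4.0, 2.0, 0.0]
scores = {thr: categorical_stats(est, ref, thr) for thr in (1, 2, 5)}
```

In the toy series at the 1 mm/6 h threshold, one miss and one false alarm cancel in BIASa, which equals 1 even though the event locations do not match.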
As shown in Fig. 3, the overall precipitation detection skill of the precipitation products (e.g., based on ETS) diminishes as the precipitation threshold increases, meaning that the satellite products are less skillful in capturing the correct magnitude of intense precipitation events. For the satellite products with no bias-adjustment (Fig. 3d), BIASa increases significantly as the precipitation threshold increases. This suggests that during intense precipitation events, the number of grid-boxes incorrectly classified as rain (False alarms) tends to be substantially larger than the number of grid-boxes incorrectly classified as no-rain (Misses). The bias-adjusted products are skillful in maintaining a BIASa value around 1, indicating that the total area of precipitation events is well captured, particularly at thresholds less than 10 mm/h. It is worth noting that a perfect BIASa score does not necessarily indicate a perfect match between the precipitation/no-precipitation grid-boxes of the observed and predicted fields. CMORPH demonstrates high skill in detecting precipitation events across the entire range of precipitation intensities (see Fig. 3b). However, similar to TMPA-RT, it significantly overestimates the intense precipitation areas (Fig. 3d). Based on ETS in Fig. 3, CMORPH outperforms all satellite products, including those that are bias-adjusted, in delineating precipitation areas within an intensity range of less than 15 mm/6 h. As discussed by Behrangi et al. (2009) and Behrangi et al. (2010a), one reason for this could be the inability of infrared-based precipitation estimation algorithms (e.g., PERSIANN and partially TMPA) to (a) capture warm rainfall and (b) screen out no-rain thin cirrus clouds that are usually very cold. The first shortcoming may result in significant under-estimation of the total volume of rainfall, while the latter may result in assigning precipitation to areas with no precipitation.
Fig. 3. Binary analysis of precipitation occurrence using ETS, POD, FAR, and BIASa scores at a range of precipitation intensities. For the thresholds 1, 2, 5, 10, 15, 20, and 30 mm/h, the total number of rain samples based on multi-sensor data are 846, 635, 391, 188, 123, 67, and 24, respectively. The total number of rain and no-rain samples is 8768.
Fig. 4 shows the time series of monthly precipitation rates for all of the precipitation products. Besides the visual inspection, the statistical measures at the top-right corners of Fig. 4a–e suggest that all satellite products can capture the general precipitation pattern at monthly scale. The monthly averaged precipitation rate mostly peaks during spring and early summer and reaches its minimum during late fall and early winter. During spring and summer months, the unadjusted satellite products (TMPA-RT, CMORPH, and PERSIANN) tend to significantly overestimate the amount of precipitation, while slight under-estimation is observed during the cold seasons. As expected, the bias-adjusted products (Fig. 4d and e) demonstrate substantial improvement in capturing the monthly variation and total amount of precipitation throughout the study period. The statistical measures reported in Fig. 4 indicate that for the satellite products with no bias-adjustment (Fig. 4a–c), COR and RMSE range between 0.74 and 0.81 and between 0.08 and 0.09 mm/h, respectively. BIAS values are similar to those in Fig. 2 (6-h data). The two bias-adjusted products appear almost identical and to a large extent resemble the multi-sensor precipitation estimates at monthly scale (COR = 0.92, RMSE = 0.03 mm/h, and negligible BIAS). One reason for this is that the bias-adjusted products are not independent of the MPE data, since both are gauge adjusted. However, the adjustments of the MPE and satellite-based products are performed over different spatial scales and temporal periods.
For additional information, readers are referred to Huffman et al. (2001) and Young et al. (2000).
Fig. 5 supplements Fig. 3 by demonstrating the monthly performance of the precipitation products in capturing precipitation events identified with a precipitation threshold of 1 mm/6 h. For each month, the statistical measures are calculated from the pool of all 6-h pairs of satellite and multi-sensor rain intensities. Fig. 5 clearly shows that the satellite products tend to overestimate precipitation area during warm months while demonstrating considerable under-estimation during cold months (e.g., see TMPA-RT in Fig. 5d). Unlike TMPA-RT and CMORPH, PERSIANN does not show under-estimation of precipitation for the first two months of the year (Fig. 5d), which could be due to its difficulty in removing grid-boxes that are incorrectly classified as rain, as suggested by FAR (Fig. 5c).

Evaluation of streamflow forecast
Six-hour streamflow hydrographs generated from the individual 6-h satellite and multi-sensor precipitation inputs are compared to streamflow observations in Fig. 6. For better visualization of streamflow peaks along with low flows (e.g., recession parts), the hydrographs are transformed using the transformation proposed by Hogue et al. (2000), which has also been used by Yilmaz et al. (2005) and Khakbaz et al. (2009). Visual inspection of the hydrographs reveals that the satellite precipitation products capture streamflow discharge reasonably well, including extreme cases and their timing. However, if not bias-adjusted (Fig. 6b–d), satellite products demonstrate significant overestimation of peak flows, extending into the recession periods. Fig. 6 indicates that while severe streamflows may occur in any season, satellite products tend to overestimate flood magnitudes mostly in spring and summer. To some extent, this is consistent with the input precipitation analysis in Section 4a. Table 1 compares the statistics for input precipitation and resulting streamflow during the calibration and validation periods. The performance measures in Table 1, along with the streamflow hydrographs displayed in Fig. 6, suggest that the overestimation of streamflow is more significant for CMORPH and TMPA-RT during both calibration and validation periods, with streamflow BIAS values exceeding 50%. In both periods, the PERSIANN-based streamflow presents better BIAS and RMSE scores than CMORPH and TMPA-RT, but a worse COR than CMORPH. The streamflows generated from the bias-adjusted products (multi-sensor, TMPA-V6, and PERSIANN-adj) capture the streamflow magnitude and timing fairly well during both calibration and validation periods. This highlights that bias-adjustment is a crucial step in improving the overall skill of the satellite products for streamflow predictions.
Note that the statistical measures reported in Table 1 indicate that the prediction skill is generally higher during the validation period than during the calibration period. One reason for this could be the more complicated streamflow hydrograph, with many extreme streamflows, during the calibration period as compared to the validation period.
More detailed analysis of the simulated streamflows can be obtained from Fig. 7, where 6-h streamflows are cross-compared using categorical statistics calculated from the contingency table over a range of streamflow thresholds. The left and right columns provide the statistical measures for the calibration and validation periods, respectively. Fig. 7a and b show that the POD decreases as the threshold increases. This indicates that using the satellite-based products for streamflow simulation may result in a substantial loss of skill in detecting severe flood events. Fig. 7b and c indicate that bias-adjusted products produce a smaller number of incorrectly identified streamflows (FAR) than those that are not bias-adjusted. The improved FAR in the bias-adjusted cases is more remarkable at higher streamflow thresholds, which is consistent with the results for the precipitation inputs (Fig. 2c). Fig. 7c and d demonstrate that satellite products with no bias-adjustment generally result in significant overestimation of river discharge during extreme streamflow cases. However, the bias-adjusted products tend to cause under-estimation of the extreme streamflows, presenting a considerable decline in streamflow BIASa at higher thresholds (Fig. 7e and f). Fig. 7, together with Fig. 6 and Table 1, demonstrates that the simulated streamflow using the multi-sensor precipitation product is superior to those obtained from the other products, including the bias-adjusted ones. Such superiority is more marked in the FAR, COR, and RMSE measures.
Fig. 5. Monthly performance of the precipitation products in capturing precipitation events as identified with precipitation threshold of 1 mm/6 h. The total number of samples used for analysis range between 680 and 740.
Fig. 8 shows the time series of monthly averaged 6-h streamflows generated for the individual satellite and multi-sensor precipitation products. The monthly averaged 6-h observed streamflow is also shown in each panel to serve as a reference for comparison. Similar to Fig. 6, the hydrographs are transformed using Eq. (1) to show peaks and low flows clearly in a single plot. Fig. 8 shows that during spring and summer months, the monthly streamflows are mostly overestimated by the satellite products. This is consistent with the previously reported monthly precipitation input and 6-h streamflow analyses. Table 2 provides monthly scale statistical measures for the individual precipitation inputs and corresponding streamflow predictions, separately for the calibration and validation periods. The monthly COR for input precipitation and predicted streamflow ranges between 0.76–0.91 and 0.70–0.96 during the calibration period and between 0.70–0.95 and 0.62–0.92 during the validation period, respectively. Simulated streamflows from multi-sensor precipitation inputs provide the highest COR (exceeding 0.92) at the basin's outlet, followed by TMPA-V6 and PERSIANN-adj. Among the satellite products with no bias-adjustment, CMORPH provides the highest COR (exceeding 0.73). Table 2 indicates that introducing the bias-adjusted precipitation products to the SAC-SMA model reduces the RMSE of simulated streamflow significantly. TMPA-RT and CMORPH demonstrate the highest overall RMSE in simulated streamflow during both calibration and validation periods. Fig. 9 displays monthly COR, RMSE, and BIAS for input precipitation and output streamflow predictions across all 12 months. Both calibration and validation datasets are included in Fig. 9. The figure suggests the following: (1) The order of merit for precipitation products is not necessarily preserved in the corresponding output streamflow predictions. (2) During spring and summer months, significant overestimation of both precipitation inputs and resulting streamflows is observed for the satellite products with no bias-adjustment (Fig. 9e and f). For colder months, the unadjusted satellite products demonstrate slight under-estimation. (3) The bias-adjusted satellite products perform similarly across different months, with slight under-estimation of observed streamflow, particularly during cold months. (4) While the satellite products demonstrate fairly consistent COR across different months of the year, the resulting streamflows show a significant decline in COR during warm months.
This shows that the observed streamflow pattern during warm months is not well captured by the streamflow simulated from satellite precipitation input. One reason for the observed inconsistency in COR results may be the significant overestimation of precipitation in conjunction with the non-linearity of the rainfall-runoff process. (5) RMSE is higher during warm months and declines during cold months. (6) Overall, the streamflow generated from the multi-sensor precipitation product outperforms all other streamflow predictions fairly consistently across all months. It is worth noting that quantifying a true relationship between rainfall skill and runoff skill is not straightforward. The main issue is that the precipitation statistics are calculated with respect to the multi-sensor product, while the statistics for predicted streamflows are based on streamflow observations, and these two references are inherently different.
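The monthly COR, RMSE, and BIAS comparison discussed above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, COR is taken to be the Pearson correlation, and BIAS is assumed to be the relative difference in totals, which may differ from the definition used in the paper. The same routine would be called twice with different references, once for precipitation against the multi-sensor product and once for simulated streamflow against gauge observations.

```python
import numpy as np

def continuous_scores(estimated, observed):
    """Return (COR, RMSE, BIAS) for paired estimated/observed series."""
    est = np.asarray(estimated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    # Drop pairs where either value is missing.
    valid = ~(np.isnan(est) | np.isnan(obs))
    est, obs = est[valid], obs[valid]
    cor = np.corrcoef(est, obs)[0, 1]              # Pearson correlation
    rmse = np.sqrt(np.mean((est - obs) ** 2))      # root mean square error
    # Assumption: BIAS is the relative difference of totals (0 = unbiased).
    bias = np.sum(est - obs) / np.sum(obs)
    return cor, rmse, bias

def scores_by_calendar_month(months, estimated, observed):
    """Group pairs by calendar month (1-12) and score each group."""
    months = np.asarray(months)
    est = np.asarray(estimated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    return {
        int(m): continuous_scores(est[months == m], obs[months == m])
        for m in np.unique(months)
    }
```

Because the two calls use different references, a product's rank for precipitation need not carry over to its rank for streamflow, which is the caveat raised in the text.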

Summary and conclusions
Satellite-based precipitation data are viable sources of precipitation, particularly for regions with poor or nonexistent ground-reference measurements. Despite their global coverage and uninterrupted availability, satellite data are not commonly integrated into operational hydrologic modeling, mainly because of a lack of information on the reliability of such products at basin scale. Over a mid-sized basin, six years of data from five satellite-based precipitation products, namely TMPA-RT, TMPA-V6, CMORPH, PERSIANN and PERSIANN-adj, are first evaluated with respect to a multi-sensor (NEXRAD and gauge) dataset. The precipitation products are then introduced to the lumped SAC-SMA rainfall-runoff model to generate streamflows at 6-h and monthly time scales, and the results are compared to streamflows measured by gauge. Statistical analysis indicates that the bias-adjusted satellite precipitation products agree better with gauge-adjusted radar than their counterparts with no bias-adjustment, particularly in capturing the timing, occurrence and magnitude of precipitation events.
Fig. 7. Binary analysis of 6-h streamflows using ETS, POD, FAR, and BIASa at a range of streamflow thresholds. The left-side and right-side panels display the analysis during calibration and validation periods, respectively. For the calibration period and for the thresholds 2, 3, 5, 7, 9, and 11 cm (log scale), the numbers of samples (based on streamflow measurement) that exceed the threshold values are 3599, 2395, 723, 245, 109, and 52, respectively. Similarly, for the validation period the numbers of samples (based on streamflow measurement) that exceed the threshold values are 3525, 2166, 418, 140, 69, and 34, respectively. The total numbers of streamflow measurements for the calibration and validation periods are 4334 and 4362, respectively.
The satellite precipitation products with no bias-adjustment (TMPA-RT, CMORPH and PERSIANN) tend to overestimate intense precipitation events quite significantly, particularly during warm months. Reported POD and FAR values indicate that during intense precipitation events more grids are incorrectly classified as rain than are incorrectly classified as no-rain. At both the 6-h and monthly time scales, CMORPH generally demonstrates higher skill in delineating precipitation area, and its estimated precipitation rate correlates better with multi-sensor precipitation. On the other hand, PERSIANN generally gives a better estimate of total precipitation (BIAS) with lower RMSE at both time scales. Binary analyses of 6-h simulated streamflows show that as streamflow magnitude increases, satellite-based simulations commonly demonstrate less skill in detecting the extreme streamflows. This is indicated by generally decreasing POD, increasing FAR, and increasing areal bias (BIASa for unadjusted precipitation input) at higher streamflows. While bias-adjusted precipitation inputs result in improved capability to capture the extreme streamflows, they may result in considerable under-estimation (Fig. 7e and f). Analysis of streamflow magnitude suggests that during warm months (spring and summer) streamflows are overestimated (high BIAS and high RMSE) when satellite products with no bias-adjustment are introduced to the hydrologic model. Unadjusted precipitation inputs also tend to yield overestimation of peak flows. The observed overestimation of streamflow is consistent with the observed overestimation of precipitation and is less severe where bias-adjusted precipitation is introduced to the model. Bias-adjusted precipitation inputs, in contrast, yield more skill in capturing the true magnitude of streamflow, with generally improved COR, RMSE and BIAS scores.
Overall, and recognizing that remotely sensed satellite precipitation data are subject to significant errors, the present study indicates that there is potential for using satellite data in streamflow forecasting. The basin-scale analyses presented here show that the bias-adjusted products (TMPA-V6 and PERSIANN-adj) are more promising than their unadjusted counterparts, as they yield streamflows more comparable to ground-reference observations. While it is concluded that bias-adjustment is a very important step in improving the applicability of satellite precipitation data for streamflow simulations, even the bias-adjusted precipitation products are still imperfect and may result in poor detection and estimation of precipitation extent and intensity, particularly during extreme events. Also, the existing bias-adjusted products cannot be employed in near real-time applications because of their bias-adjustment scheme (usually monthly bias adjustment). Therefore, along with efforts to develop better precipitation estimation techniques, robust near real-time bias-adjustment methods need to be investigated.
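To illustrate why a monthly bias-adjustment scheme is hard to use in near real time, the sketch below rescales satellite estimates so that each month's total matches a reference total. This is a simplified multiplicative scaling written for illustration only. It is not the algorithm used by TMPA-V6 or PERSIANN-adj, and the function name and arguments are hypothetical. The point it shows is that the scaling factor depends on the full month's reference total, which is only known after the month has ended.

```python
import numpy as np

def monthly_ratio_adjust(sat_precip, month_index, ref_monthly_total):
    """Scale satellite precipitation so monthly totals match a reference.

    sat_precip        : satellite estimates, one value per time step
    month_index       : month label for each time step (any hashable key)
    ref_monthly_total : dict mapping month label -> reference (e.g. gauge) total
    """
    sat = np.asarray(sat_precip, dtype=float)
    idx = np.asarray(month_index)
    adjusted = sat.copy()
    for month, ref_total in ref_monthly_total.items():
        in_month = idx == month
        sat_total = sat[in_month].sum()
        # Months with no satellite rain are left unchanged: there is
        # nothing to rescale, so the ratio would be undefined.
        if sat_total > 0:
            adjusted[in_month] = sat[in_month] * (ref_total / sat_total)
    return adjusted
```

A near real-time alternative would have to estimate the correction from information available at forecast time, for example a factor derived from preceding periods; that is the kind of method the conclusion calls for, and it is not attempted here.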
Institute of Technology, under a contract with the National Aeronautics and Space Administration.
where N is the total number of observed and estimated rain pairs.
(b) Categorical statistics using the contingency table. The contingency table is constructed using a specific precipitation threshold (Fig. A), where H is the number of hits, F the false alarms, M the misses, and Z the correct negatives.
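The categorical measures built from H, F, M, and Z can be sketched as follows. The paper's own equations are not reproduced in this text, so the formulas below are the standard forecast-verification definitions of POD, FAR, ETS, and frequency (areal) bias, and their match to the paper's definitions is an assumption. The function name is hypothetical, and an event is assumed to mean a value strictly greater than the threshold.

```python
import numpy as np

def categorical_scores(estimated, observed, threshold):
    """Contingency-table scores for events exceeding `threshold`."""
    est_yes = np.asarray(estimated, dtype=float) > threshold
    obs_yes = np.asarray(observed, dtype=float) > threshold

    H = int(np.sum(est_yes & obs_yes))     # hits
    F = int(np.sum(est_yes & ~obs_yes))    # false alarms
    M = int(np.sum(~est_yes & obs_yes))    # misses
    Z = int(np.sum(~est_yes & ~obs_yes))   # correct negatives
    n = H + F + M + Z

    def ratio(num, den):
        # Return NaN instead of raising when a score is undefined.
        return num / den if den > 0 else float("nan")

    # Hits expected by chance, used by the equitable threat score.
    hits_random = ratio((H + F) * (H + M), n)
    return {
        "H": H, "F": F, "M": M, "Z": Z,
        "POD": ratio(H, H + M),                # probability of detection
        "FAR": ratio(F, H + F),                # false alarm ratio
        "BIASa": ratio(H + F, H + M),          # frequency (areal) bias
        "ETS": ratio(H - hits_random, H + F + M - hits_random),
    }
```

The same function applies to both uses in the study: precipitation events at a threshold such as 1 mm/6 h, and streamflow exceedances at each of the streamflow thresholds.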