Developing Intensity‐Duration‐Frequency (IDF) Curves From Satellite‐Based Precipitation: Methodology and Evaluation

Given the continuous advancement in the retrieval of precipitation from satellites, it is important to develop methods that incorporate satellite‐based precipitation data sets in the design and planning of infrastructure. This is because in many regions around the world, in situ rainfall observations are sparse and have insufficient record length. A handful of studies examined the use of satellite‐based precipitation to develop intensity‐duration‐frequency (IDF) curves; however, they have mostly focused on small spatial domains and relied on combining satellite‐based with ground‐based precipitation data sets. In this study, we explore this issue by providing a methodological framework with the potential to be applied in ungauged regions. This framework is based on accounting for the characteristics of satellite‐based precipitation products, namely, adjustment of bias and transformation of areal to point rainfall. The latter method is based on previous studies on the reverse transformation (point to areal) commonly used to obtain catchment‐scale IDF curves. The paper proceeds by applying this framework to develop IDF curves over the contiguous United States (CONUS); the data set used is Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks – Climate Data Record (PERSIANN‐CDR). IDFs are then evaluated against National Oceanic and Atmospheric Administration (NOAA) Atlas 14 to provide a quantitative estimate of their accuracy. Results show that median errors are in the range of (17–22%), (6–12%), and (3–8%) for one‐day, two‐day and three‐day IDFs, respectively, and return periods in the range (2–100) years. Furthermore, a considerable percentage of satellite‐based IDFs lie within the confidence interval of NOAA Atlas 14.


Introduction
Engineering design of infrastructure requires information about the runoff magnitudes that structures must withstand during their lifetime. To estimate these magnitudes, intensity-duration-frequency (IDF) curves, or equivalently depth-duration-frequency (DDF) curves, are the typical input to hydrological models used by hydrologists and civil engineers for design purposes. They represent a mathematical relationship between the frequency, duration, and intensity (depth) of rainfall events. Their accuracy is contingent upon input data quality and statistical inference methods. The concept of the IDF dates back to the efforts of Bernard (1932), and since then many studies have focused on improving the statistical inference methods used in IDF development. Most notable are the studies of Hosking and Wallis (1997), which introduced L-moments estimation (see also Hosking) and regionalization methods such as the Index Flood method (Dalrymple, 1960; Hosking & Wallis, 1993; Wallis, 1982). Today, atlases of IDF curves have been developed for several developed countries; an example is the National Oceanic and Atmospheric Administration (NOAA) Atlas 14, developed by the National Weather Service (NWS) at NOAA (Bonnin et al., 2006, 2011; Perica et al., 2011, 2013a, 2013b), which succeeded NOAA Atlas 2 developed in 1973. Despite these methodological advancements in IDF formulation, construction of IDF curves for most countries around the world remains a major challenge. This is mainly because of the limited availability of long rainfall records with adequate spatial distribution to reflect the temporal variation and spatial heterogeneity of precipitation. As stated earlier, the accuracy of IDF curves depends on both input data quality and statistical inference methods. While considerable research has focused on the latter, only a handful of studies have examined the former.
Some of these studies investigated the use of alternative sources of rainfall measurements such as radar (Eldardiry et al., 2015;Marra et al., 2017;Marra & Morin, 2015;Overeem et al., 2008), satellite-based precipitation, or downscaled global climate model's simulations of precipitation (DeGaetano & Castellano, 2017). Regarding the use of satellite-based precipitation, Endreny and Imbeah (2009) utilized Tropical Rainfall Measuring Mission (TRMM) rainfall data set in combination with rainfall data from ground gauges to construct IDF curves over Ghana. Similarly, Awadallah et al. (2011) investigated the use of TRMM and ground-based rainfall data to develop IDF curves over a region in Northwestern Angola. Recently, Gado et al. (2017) used the PERSIANN-CDR data set to develop IDF curves in ungauged sites by combining ground gauge data from neighboring sites in two basins in Colorado and California. Meanwhile, Marra et al. (2017) used Climate Prediction Center morphing (CMORPH) data to develop IDF curves over the eastern Mediterranean and compared them to IDF derived from radar data. Overall, these studies highlighted the benefit of using satellite-retrieved precipitation as an alternative source, particularly in partially gauged sites. However, several reasons limit the adequacy of these studies and the extension of their application to other regions. First, the methods used in most of these studies strongly rely on the partial availability of rainfall data sets with sufficient record length from ground gauges in the site of interest or in their neighboring sites, which is not satisfied in many regions. Second, they approached the use of satellite-based precipitation in IDF development from a case-study perspective and focused on small-scale regions; therefore, it is uncertain whether these methods will provide adequate results in regions with different climatic and precipitation regimes. 
Finally, and most importantly, the results of these studies were either not evaluated or evaluated in a small-scale domain. In other words, it is unknown whether these results underestimate or overestimate IDF curves that would ideally be derived from a dense network of rainfall gauges data.
In light of the aforementioned issues associated with previous studies on the use of satellite-based precipitation to develop IDF curves, the overarching goal of this article is to provide a methodological framework for developing IDF curves from satellite-based precipitation. This is achieved by, first, considering and analyzing the systematic error component (i.e., bias) in extreme satellite-retrieved precipitation; second, considering the necessary transformation of satellite-based precipitation from an area averaged to point rainfall; and finally, the application of commonly used regionalization methods to derive IDF curves. The area-to-point transformation implied in this framework is based on previous research studies that focused on the reverse transformation (i.e., point-to-area). The paper proceeds by applying this framework to develop IDF curves of durations one, two, and three days over the contiguous United States (CONUS). While this research was motivated by the potential of using satellite-based precipitation to develop IDF curves in data-scarce regions, CONUS has been chosen as a test bed because of the availability of rigorous IDF estimates from ground gauges provided by NOAA Atlas 14. Therefore, IDF curves derived from satellite-based precipitation are evaluated and compared to NOAA Atlas 14 to assess the performance of the framework.
The subsequent sections of this article are organized as follows. Section 2 presents the data sets used in this study as well as describing the case study region and its geographic sections. In section 3, a detailed description of the methodological framework for developing IDF curves is provided. This section includes the analysis of systematic error in extreme satellite-based precipitation, a model for bias adjustment of extreme satellite-based precipitation, and a method to transform areal rainfall to point rainfall. Section 4 provides the results and their evaluation as well as an analysis of their uncertainty. The paper concludes in section 5 by highlighting the main findings and discussing issues that need to be explored thoroughly and that might be the focus of future research studies.
Although in this study IDF curves have been derived solely from PERSIANN-CDR, two secondary data sets were used. First, the Climate Prediction Center (CPC) Unified Gauge-Based Analysis of Daily Precipitation over CONUS, hereafter referred to as CPC, has been used to estimate the parameters of the bias adjustment model. Second, the NOAA Atlas 14 data set has been used as the basis for evaluating IDF curves derived from satellite-based precipitation.

CPC Unified Gauge-Based Analysis of Daily Precipitation Over CONUS
The CPC data set was developed by NOAA's CPC. It covers the period from 1948 to present and has a spatial resolution similar to that of PERSIANN-CDR (0.25° × 0.25°). However, in this study, only the CPC record for the period from 1983 to present (i.e., the same time coverage as PERSIANN-CDR) has been used. The CPC data set was produced from a dense gauge network over CONUS with approximately 8,500 stations and a mean station-to-station distance of 30 km (Chen et al., 2008). The interpolation algorithm used to develop the product is the Optimal Interpolation (OI) method (Gandin, 1965); this method has proved reliable and has produced highly correlated results in several studies (Chen et al., 2008; see also Bussieres & Hogg, 1989; Creutin & Obled, 1982).

NOAA Atlas 14
The NOAA Atlas 14 data set was developed by NOAA's National Weather Service (NWS; Bonnin et al., 2006, 2011; Perica et al., 2011, 2013a, 2013b), and it is not yet available for the states of Texas, Oregon, Washington, Idaho, Montana, and Wyoming. NOAA Atlas 14 over CONUS is divided into five geographic regions, as shown in Figure 1; these regions have been adopted in this study for evaluation purposes since they represent, to some extent, regions with distinct climatic and precipitation regimes. NOAA Atlas 14 is derived from a dense network of rainfall gauges with average record lengths in the range of 54 to 68 years (Bonnin et al., 2006, 2011; Perica et al., 2011, 2013a, 2013b).

Bias in Satellite-Based Extreme Precipitation
In recent decades, a multitude of studies have been devoted to the evaluation of satellite-retrieved precipitation (e.g., AghaKouchak et al., 2011; Behrangi et al., 2011; Dinko et al., 2008; Ebert et al., 2007; Sorooshian et al., 2000). While these evaluation studies differ in many aspects, such as the geographic location over which the evaluation is performed, the temporal scale (e.g., daily, monthly), and the evaluation metrics, the consensus is that satellite-based precipitation exhibits both random and systematic errors. Moreover, satellite-based precipitation products have lower skill in detecting heavy rainfall (Mehran & AghaKouchak). Therefore, it is necessary to examine errors in satellite-based precipitation prior to their use in IDF development. As far as IDF studies are concerned, only extreme rainfall events, defined as events higher than the 99th percentile of the distribution of rainfall totals accumulated over a specific duration, are of importance. This is because both approaches commonly used to sample extreme events, namely, Annual Maximum Series (AMS) and Partial Duration Series (PDS), contain rainfall values that are typically higher than the 99th percentile. Hence, in this study, the analysis of errors in PERSIANN-CDR is carried out as follows. First, the AMS is extracted from both the ground-based precipitation (CPC) and the satellite-based precipitation (PERSIANN-CDR) data sets for each grid (0.25° × 0.25°); the AMS comprises 33 hydrological years. Second, an adjustment factor (ζ) is defined as the ratio of ground-based (CPC) to satellite-based (PERSIANN-CDR) precipitation, that is:
$$\zeta(x, y, k) = \frac{R_G(x, y, k)}{R_S(x, y, k)} \quad (1)$$
where ζ(x, y, k) is the adjustment factor for the kth event in the AMS at location (x, y), $R_G(x, y, k)$ is the kth ground-based rainfall event in the AMS at location (x, y), and $R_S(x, y, k)$ is the kth satellite-based rainfall event in the AMS at location (x, y).
Next, at each grid location the average adjustment factor $\bar{\zeta}(x, y)$ of the values in equation (1) is calculated; this factor represents the systematic error (i.e., bias) in extreme satellite-based precipitation. Figure 2 shows the relationship between elevation and $\bar{\zeta}(x, y)$. The bias is significantly correlated with elevation, as indicated by a Pearson's correlation coefficient of 0.54; that is, 29% ($0.54^2$) of the variability in the bias can be explained linearly by elevation. Hence, it can be concluded that, in general, satellite-based precipitation (PERSIANN-CDR) tends to have higher bias, particularly underestimation bias, in high-altitude regions. The presence of this relationship in PERSIANN-CDR as well as in other satellite-based precipitation products has been observed in previous studies (e.g., Hashemi et al., 2017; Miao et al., 2015). It arises because warm orographic rainfall over high-altitude regions poses a challenge to satellite-based precipitation retrieval algorithms based on IR imagery (Dinko et al., 2008).
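The bias analysis described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the gauge and satellite annual maxima are paired by year, and every number below is made up for three hypothetical grid cells.

```python
from statistics import mean


def adjustment_factors(gauge_ams, satellite_ams):
    """Event-wise ratio of gauge to satellite annual maxima, as in equation (1)."""
    if len(gauge_ams) != len(satellite_ams):
        raise ValueError("both series must cover the same years")
    return [g / s for g, s in zip(gauge_ams, satellite_ams)]


def mean_adjustment_factor(gauge_ams, satellite_ams):
    """Average adjustment factor at one grid cell (the bias estimate)."""
    return mean(adjustment_factors(gauge_ams, satellite_ams))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Made-up values for three hypothetical grid cells (mm/day), illustration only.
elevation_m = [50.0, 900.0, 2400.0]
gauge = [[60.0, 72.0, 55.0], [48.0, 66.0, 51.0], [40.0, 58.0, 45.0]]
satellite = [[58.0, 70.0, 56.0], [40.0, 52.0, 44.0], [25.0, 33.0, 30.0]]

zeta_bar = [mean_adjustment_factor(g, s) for g, s in zip(gauge, satellite)]
r = pearson_r(elevation_m, zeta_bar)
print(zeta_bar, r, r ** 2)  # r**2 is the share of variance explained linearly
```

A factor above 1 means the satellite product underestimates the gauge-based extreme at that cell, which is the situation reported for high elevations.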

Bias Adjustment Model
Based on the previous analysis, the following model is proposed as a new approach to adjust extreme satellite-retrieved precipitation; the model utilizes elevation as the only explanatory variable.
where E is elevation in meters, α and β are parameters, and $\bar{\zeta}(x, y)$ is defined as before. Figure 2 (blue curve) shows the adjustment factors estimated by the model for the one-day annual maximum series. The model parameters can be estimated simply, because the model can be linearized and solved analytically: taking the natural logarithm of both sides of equation (2) yields a linear relationship whose parameters ln(α) and β are obtained by ordinary least squares. The parameters α and β are estimated separately for each duration of interest (i.e., one day, two days, and three days). Table 1 lists the parameter values and their 95% confidence intervals for each duration.
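The linearized estimation can be illustrated as follows. The functional form of equation (2) is not reproduced in this text, so the sketch assumes, purely for illustration, a power law in which the adjustment factor equals α times E raised to β; an exponential form would linearize the same way with E in place of ln(E). The function names and the synthetic data are hypothetical.

```python
import math


def fit_log_linear(elevation_m, zeta_bar):
    """OLS fit of ln(zeta) = ln(alpha) + beta * ln(E); returns (alpha, beta)."""
    x = [math.log(e) for e in elevation_m]  # requires E > 0
    y = [math.log(z) for z in zeta_bar]     # requires zeta > 0
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    beta = (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))
    ln_alpha = my - beta * mx
    return math.exp(ln_alpha), beta


def adjust(rainfall_mm, elevation_m, alpha, beta):
    """Scale a satellite extreme by the modelled factor (assumed power law)."""
    return rainfall_mm * alpha * elevation_m ** beta


# Synthetic check: data generated with alpha = 0.6 and beta = 0.12 are recovered.
elev = [10.0, 100.0, 500.0, 1500.0, 3000.0]
zeta = [0.6 * e ** 0.12 for e in elev]
alpha_hat, beta_hat = fit_log_linear(elev, zeta)
print(alpha_hat, beta_hat)
```

Fitting in log space keeps the estimation closed-form, at the cost of minimizing errors in ln(ζ) rather than in ζ itself.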

Transformation of Areal Rainfall to Point Rainfall
An important issue to be considered when developing IDF curves from satellite-based precipitation is the areal nature of the data, since all products of satellite-based precipitation estimate the average precipitation depth over a grid area, which in the case of PERSIANN-CDR is 25 km × 25 km = 625 km². The areal rainfall distribution has both a lower mean and a lower variance than the distribution of point rainfall; this follows directly from the fact that the former is an averaged random process of the latter. It is widely stated in the literature that, in general, the difference between the areal and point distributions increases as the total rainfall depth decreases (Eagleson, 1970; Rodriguez-Iturbe & Mejía, 1974a). This is mainly because events that produce low amounts of rainfall tend to be more localized. This relationship is present in satellite-based precipitation, as shown in Figure 3: high quantiles of rainfall depth correspond to low quantiles of bias (i.e., systematic error), with a Pearson's correlation coefficient of −0.38.

Figure 2. Relationship between elevation and the average adjustment factor from equation (1) for the one-day annual maximum series. The red dots represent observations, while the blue line represents the adjustment model calculated from equation (2) with the parameter values given in Table 1.

Water Resources Research
Considerable research attention has been devoted to methods that transform point IDF curves to areal IDF curves; such a transformation requires reduction factors commonly known as areal reduction factors (ARFs). Methods of developing ARFs fall into two categories: first, empirical methods, which use rainfall time series from a gauge network in a specific region to develop relationships between point and area-averaged rainfall (e.g., U.S. Weather Bureau, 1957, 1958), and second, theory-based methods, which are based on the stochastic representation of rainfall fields in space and time. In this study, we adopt a theory-based approach to derive ARFs, which was proposed by Sivapalan and Blöschl (1998) and is based on the spatial correlation structure of the rainfall field. It should be noted that, contrary to the common use of ARFs, we are interested in transforming areal to point rainfall; thus, we use the reciprocals of ARFs. The methodology consists of first assuming an isotropic correlogram (i.e., spatial correlation structure) of point rainfall of the following exponential form:
$$\rho(r) = \exp\left(-\frac{r}{\lambda}\right) \quad (3)$$
where ρ is correlation, r is the Euclidean distance between two points, and λ is a parameter that specifies the decay in correlation. To estimate the parameter λ, equation (3) is fitted to preserve the mean observed correlation at a distance known as the characteristic distance $r_A$ (Rodriguez-Iturbe & Mejía, 1974a, 1974b); this distance is a function of the shape and size of the area under consideration. The characteristic distance ($r_A$) is defined as the mean distance between two randomly chosen points in the region of interest, and its distribution was provided by Ghosh (1951). Matérn (1986) used the distribution to compute the ratio of the characteristic to the maximum distances for unit areas with standard shapes (e.g., square and circle).
The following result was found for a square unit area (A): Applying this result on the grids of PERSIANN-CDR (25 km × 25 km) will result in a characteristic distance of 26.07 km. However, because the distances between the grid centers for which equation (3) can be computed can only take multiples of 25 km (i.e., 25, 50, and 75), we have taken r A to be 25 km. Then, the average observed cross correlation between the annual maximum series at distances of 25 km was calculated for each of the geographic sections shown in Figure 1. Finally, equation (3) is fitted to the values of observed correlations to estimate the value of parameter λ.
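When the correlogram is anchored at a single distance, the fit reduces to a closed form. The sketch below assumes the common exponential form ρ(r) = exp(−r/λ); the regional correlation values are hypothetical placeholders, not results from this study.

```python
import math


def correlogram(r_km, lam_km):
    """Exponential correlogram: correlation at separation r for decay length lam."""
    return math.exp(-r_km / lam_km)


def fit_lambda(rho_obs, r_a_km=25.0):
    """Decay parameter that reproduces the observed mean correlation at r_a."""
    if not 0.0 < rho_obs < 1.0:
        raise ValueError("observed correlation must lie strictly between 0 and 1")
    return -r_a_km / math.log(rho_obs)


# Hypothetical regional mean correlations between AMS of neighbouring cells.
observed = {"region_a": 0.85, "region_b": 0.60}
for name, rho in observed.items():
    lam = fit_lambda(rho)
    print(name, round(lam, 1), "km")
```

A higher observed correlation at 25 km yields a larger λ, that is, a rainfall field that decorrelates more slowly with distance.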
Additionally, to evaluate the bias resulting from assigning a value of 25 km instead of 26.07 km to the characteristic distance, the sensitivity of the parameter λ to changes in the characteristic distance has been investigated. The results (see Table 2) demonstrate that the sensitivity differs among geographic regions depending on the precipitation mechanism. However, the average sensitivity is on the order of 7.5% for a change of 25 km in the characteristic distance, and it increases consistently with larger changes in the characteristic distance. Therefore, the bias in the parameter λ resulting from assigning a value of 25 km instead of 26.07 km is, on average, significantly less than 7.5%.
After estimating the parameter λ, the variance reduction factor $\kappa^2$, defined as the expectation of the correlation between any two random points within the region under consideration, can be calculated according to the following equation (Rodriguez-Iturbe & Mejía, 1974a; Sivapalan & Blöschl, 1998):
$$\kappa^2 = \frac{1}{A^2} \int_A \int_A \rho\left(\lVert \mathbf{x}_1 - \mathbf{x}_2 \rVert\right) d\mathbf{x}_1 \, d\mathbf{x}_2 \quad (5)$$
where $\mathbf{x}_1$ and $\mathbf{x}_2$ are two points within the region of area A. Furthermore, Rodriguez-Iturbe and Mejía (1974a) showed that equation (5) can be simplified by integrating the product of the probability density function of the variable r and the correlation function according to the following equation:
$$\kappa^2 = \int_0^{r_{\max}} \rho(r) \, f_R(r) \, dr \quad (6)$$
where ρ and r are defined as above, $r_{\max}$ is the maximum distance between two points in the region, and $f_R(r)$ is the probability density function of the random variable r. For a square area with side length a (e.g., in the case of PERSIANN-CDR, a = 25 km), Ghosh (1951) derived the distribution of r (i.e., $f_R(r)$).
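Because the variance reduction factor is an expectation over pairs of random points, it can also be approximated numerically. The sketch below swaps the analytical integration over Ghosh's distribution for plain Monte Carlo sampling, and it assumes an exponential correlogram exp(−r/λ); the function name and the λ values in the demo are illustrative choices.

```python
import math
import random


def variance_reduction_factor(lam_km, side_km=25.0, n_pairs=100_000, seed=0):
    """Monte Carlo estimate of the mean correlation between two uniformly
    distributed random points inside a square of the given side length."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        dx = (rng.random() - rng.random()) * side_km
        dy = (rng.random() - rng.random()) * side_km
        total += math.exp(-math.hypot(dx, dy) / lam_km)
    return total / n_pairs


for lam in (30.0, 120.0, 160.0):
    print(lam, variance_reduction_factor(lam, n_pairs=20_000))
```

The estimate approaches 1 as λ grows relative to the grid size, meaning that areal averaging removes little variance when rainfall is strongly correlated across the cell.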
The final step is to use the variance reduction factor estimated from equation (6) to adjust the parameters of the generalized extreme value (GEV) probability distribution that will be fitted to the data according to equations (8) and (9). These equations have been derived by Sivapalan and Blöschl (1998) by matching the parameters of areal and point extreme rainfall distributions in the particular case of zero area. See Sivapalan and Blöschl (1998) for detailed derivation of equations (8) and (9).
where $\mu_p$ and $\mu_A$ are the point and areal GEV distribution location parameters, respectively; similarly, $\alpha_p$ and $\alpha_A$ are the point and areal scale parameters, respectively.
This theory-based approach to deriving area-to-point transformation factors was validated by Sivapalan and Blöschl (1998), who compared the ARFs derived by this method with ARFs observed in actual storms. In this study, we investigated the validity of the methodology by examining an extreme rainfall event over Texas on 27 August 2017 associated with Hurricane Harvey. Total 24-hr rainfall was obtained from NCEP Stage IV multisensor (i.e., radar and gauges) precipitation data, and the observed ARFs were then calculated (red line with markers in Figure 4). Next, using one-day IDF estimates for that region reported in Cleveland et al. (2015), the observed ARFs were matched through the selection of an appropriate correlation length λ. Further details of the validation process, such as the equations used to select the correlation length, are provided in Text S1. Figure 4 demonstrates that the two enveloping curves for the observed ARFs correspond to correlation lengths of 120 and 160 km. Since the correlation length reflects information about the rainfall-generating mechanism, these large values are consistent with the large synoptic-scale event that produced this storm. It should be noted that this is an approximate validation since the observed ARFs are storm-centered (i.e., specific to a storm), whereas the simulated ARFs are fixed-area ARFs.

Developing IDF Curves
After adjusting the bias in the annual maximum series extracted from PERSIANN-CDR using the model described in section 3.2, IDF curves are developed in several steps, illustrated in Figure 5. First, regionalization is applied to improve the statistical inference by increasing the number of samples. This is achieved by creating homogeneous regions using the k-means algorithm to cluster grids. The input data to the algorithm consist of latitude, longitude, elevation, and mean annual precipitation; these data, to a certain extent, define different climatic divisions. Next, the output clusters from the k-means algorithm are tested statistically for homogeneity using the method described in Hosking and Wallis (1993). In this method the within-cluster variation in L-CV (i.e., the ratio of the second to the first L-moment) is compared with what would be expected from simulations of a general probability distribution; in this study the Wakeby distribution (Houghton, 1978; see also Hosking & Wallis, 1997) was used. If the clusters are not satisfactory according to the homogeneity check, clustering is repeated with an increased number of groups. It should be noted that the clustering might differ for each duration of interest (e.g., one day and two days) since it depends on the L-CV values of each AMS.
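The quantity at the heart of the homogeneity check, the within-cluster spread of L-CV, can be computed as below. This is a partial sketch: it covers only the observed dispersion (weighted by record length, following the usual Hosking and Wallis formulation) and omits the simulation step that supplies the expected dispersion. The annual maxima are made up.

```python
def l_cv(sample):
    """Sample L-CV: ratio of the second to the first sample L-moment."""
    x = sorted(sample)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(j * x[j] for j in range(n)) / (n * (n - 1))
    return (2.0 * b1 - b0) / b0


def lcv_dispersion(cluster):
    """Regional L-CV and record-length-weighted standard deviation of the
    at-site L-CV values within one cluster."""
    lengths = [len(s) for s in cluster]
    t = [l_cv(s) for s in cluster]
    total = sum(lengths)
    t_regional = sum(n * ti for n, ti in zip(lengths, t)) / total
    v = (sum(n * (ti - t_regional) ** 2 for n, ti in zip(lengths, t)) / total) ** 0.5
    return t_regional, v


# Made-up annual maxima (mm) for three grid cells assigned to one cluster.
cluster = [
    [42.0, 55.0, 61.0, 48.0, 90.0, 52.0],
    [40.0, 51.0, 66.0, 45.0, 84.0, 57.0],
    [35.0, 47.0, 58.0, 41.0, 79.0, 50.0],
]
t_regional, v = lcv_dispersion(cluster)
print(t_regional, v)
```

In the full test, this observed dispersion would be standardized by the mean and standard deviation of the same statistic computed from many simulated homogeneous regions.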
Following the identification of homogeneous regions, the AMS at each grid is normalized by dividing it by the mean AMS value. Then, the AMSs in each homogeneous region are combined and fitted to a GEV distribution. The choice of the GEV distribution to model the extreme rainfall process was validated using the Kolmogorov-Smirnov test (Massey, 1951); results showed that GEV is an adequate distribution to represent the annual maximum series. The location and scale parameters of the distribution are then adjusted to account for the transformation of areal to point rainfall using the approach described in section 3.3.
Finally, precipitation quantiles corresponding to return periods of 2, 5, 10, 25, 50, and 100 years are calculated using the index flood procedure (Hosking & Wallis, 1997). In this approach, the quantiles for each homogeneous region, also known as the regional growth factors, are estimated. Next, to account for normalization, the quantiles in each grid cell are calculated by multiplying the mean AMS value at the cell by the growth factor according to the following equation:
$$q(x, y) = \mu(x, y) \, \hat{q}$$
where q(x, y) is the quantile at grid (x, y), μ(x, y) is the mean of the AMS at grid (x, y), and $\hat{q}$ is the regional growth factor for the homogeneous region of interest.
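The normalize, pool, fit, and rescale sequence can be sketched as follows. Two choices here are assumptions rather than details given in the text: the GEV is fitted by L-moments using Hosking's approximation for the shape parameter, and the area-to-point adjustment of the location and scale parameters is left out. The cell names and rainfall values are hypothetical.

```python
import math


def l_moments(sample):
    """First two sample L-moments and L-skewness via probability-weighted moments."""
    x = sorted(sample)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(j * x[j] for j in range(n)) / (n * (n - 1))
    b2 = sum(j * (j - 1) * x[j] for j in range(n)) / (n * (n - 1) * (n - 2))
    l1, l2, l3 = b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0
    return l1, l2, l3 / l2


def fit_gev(sample):
    """GEV (location, scale, shape k) from L-moments, Hosking's approximation."""
    l1, l2, t3 = l_moments(sample)
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c
    if abs(k) < 1e-6:  # Gumbel limit of the GEV
        scale = l2 / math.log(2.0)
        return l1 - 0.5772156649 * scale, scale, 0.0
    g = math.gamma(1.0 + k)
    scale = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    loc = l1 - scale * (1.0 - g) / k
    return loc, scale, k


def gev_quantile(return_period, loc, scale, k):
    """Quantile with annual non-exceedance probability 1 - 1/T."""
    y = -math.log(1.0 - 1.0 / return_period)
    if k == 0.0:
        return loc - scale * math.log(y)
    return loc + scale * (1.0 - y ** k) / k


def index_flood(region_ams, return_periods=(2, 5, 10, 25, 50, 100)):
    """Regional growth factors and per-cell quantiles (mean AMS x growth factor)."""
    means = {cell: sum(a) / len(a) for cell, a in region_ams.items()}
    pooled = [v / means[cell] for cell, a in region_ams.items() for v in a]
    params = fit_gev(pooled)
    growth = {T: gev_quantile(T, *params) for T in return_periods}
    quantiles = {cell: {T: m * growth[T] for T in return_periods}
                 for cell, m in means.items()}
    return growth, quantiles


# Made-up annual maxima (mm); cell_b is simply cell_a scaled by two.
base = [30.0, 35.0, 38.0, 42.0, 45.0, 50.0, 58.0, 70.0, 95.0]
region = {"cell_a": base, "cell_b": [2.0 * v for v in base]}
growth, quantiles = index_flood(region)
print(growth[100], quantiles["cell_a"][100], quantiles["cell_b"][100])
```

Because both cells share one growth curve, their quantiles differ only through the at-site mean, which is the defining property of the index flood approach.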

Estimation of Confidence Intervals
Confidence intervals are estimated using Monte Carlo bootstrapping; the method consists of three steps. First, the at-site empirical cumulative distribution function is estimated at each grid cell center using kernel density estimation (Parzen, 1962; Rosenblatt, 1956). Second, samples of AMS with the same record length as the original AMS are drawn from the empirical distribution. The sampling is performed by drawing a uniform random variable in the range (0, 1) and then using the empirical cumulative distribution function to estimate the corresponding quantile. Monte Carlo sampling is repeated 1,000 times to approximate the asymptotic properties of the population distribution. In the final step, the quantiles are estimated for each sample using the method described in section 3.4, and the 5th and 95th percentiles of the resulting estimates are computed to obtain the 90% confidence interval.

10.1029/2018WR022929

Figure 6 illustrates the impact of the bias adjustment at a high-altitude location (a) and a low-altitude location (b). The results suggest the following. First, PERSIANN-CDR before adjustment and CPC (red dots) follow distributions of the same shape, since the quantiles lie almost perfectly on a straight line. Second, the bias in high-altitude regions (Figure 6a) is more significant than the bias in low-altitude regions (Figure 6b). This further supports the analysis presented in section 3.1 of the significant correlation between elevation and bias. Finally, the bias adjustment model removes a sizable portion of the systematic error, as can be seen from the close alignment of the quantiles after adjustment (blue dots) with the 45° line (gray dotted line). However, it should be noted that the remaining bias, illustrated by the blue dots falling below the 45° line, will be accounted for by the areal-to-point transformation. Furthermore, it can be discerned from Figure 6 that the bias adjustment results in an overestimation of the largest event in the AMS. This is primarily because the average bias adjustment factor estimated for all values in the AMS is higher than the actual bias in the highest AMS value; this result is consistent with the analysis shown in Figure 3.
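Returning to the confidence-interval procedure described at the start of this section, the resampling loop can be sketched as follows. Several choices are assumptions made for illustration: a Gaussian kernel with a normal-reference bandwidth, sampling the kernel density by perturbing a randomly chosen observation (which matches inverse-CDF sampling in distribution), and the sample median standing in for the 2-year quantile in place of the full regional estimation.

```python
import random
from statistics import median, stdev


def kde_sample(data, size, rng):
    """Draw `size` values from a Gaussian kernel density estimate of `data`."""
    h = 1.06 * stdev(data) * len(data) ** (-0.2)  # normal-reference bandwidth
    # Choosing an observation and adding kernel noise draws from the KDE;
    # negative draws are then clipped because rainfall cannot be negative.
    return [max(0.0, rng.gauss(rng.choice(data), h)) for _ in range(size)]


def bootstrap_ci(ams, estimator, n_boot=1000, seed=0):
    """5th and 95th percentiles of `estimator` over KDE-resampled series."""
    rng = random.Random(seed)
    stats = sorted(estimator(kde_sample(ams, len(ams), rng)) for _ in range(n_boot))
    lower = stats[int(round(0.05 * (n_boot - 1)))]
    upper = stats[int(round(0.95 * (n_boot - 1)))]
    return lower, upper


# Made-up annual maxima (mm); the median of an AMS is its 2-year return level.
ams = [30.0, 35.0, 38.0, 42.0, 45.0, 50.0, 58.0, 70.0, 95.0]
print(bootstrap_ci(ams, median))
```

Any quantile estimator that accepts a list of annual maxima can be passed as `estimator`, so the same loop would serve for longer return periods.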
Although the bias adjustment model presented in this study is effective in removing bias, it can be seen from Figure 2 that for a given elevation, there is a range of values for the adjustment factor. In other words, the elevation is not a satisfactory and/or sufficient explanatory variable in some locations. Further investigation of the model's performance was conducted over CONUS (see Figure S1). The results demonstrate that the multiplicative bias in the adjusted AMS from PERSIANN-CDR is considerable over the California Central Valley, northern parts of California, Oregon, and Washington. In particular, the bias over these regions is mostly an overestimation bias. This analysis highlights that while the model is effective in removing bias over most regions in CONUS, it has limitations regarding the adjustment of overestimation bias. Figure 7a shows the contribution of area-to-point transformation in reducing the relative error of IDF estimates compared to that of the bias correction. Clearly, the bias adjustment is the prime factor in improving IDF estimates; however, areal-to-point transformation plays a considerable role in reducing the relative error of IDF estimates. A decreasing trend for the contribution of area-to-point transformation as the duration of IDF increases can be discerned from Figure 7a. Further evidence to support this conclusion is demonstrated in Figure 7b which shows the relationship between the transformation factor, duration, and return period. The inverse relationship of the transformation factor and duration is consistent with previous studies (e.g., Asquith & Famiglietti, 2000;Mineo et al., 2018), and it is justified by rainfall behavior since short-duration events are primarily associated with small areal extent and convective rainfall; meanwhile, long-duration events are distributed over a large area (Mineo et al., 2018;Sivapalan & Blöschl, 1998). 
On the other hand, the transformation factors increase with return period as shown in Figure 7b. This relationship shows that the transformation method is not independent of return period and it is consistent with previous studies (e.g., Veneziano & Langousis, 2005). This is because the transformation is applied to both the location and scale parameters of the distribution. Sivapalan and Blöschl (1998) showed that this transformation method results in a decrease of the coefficient of variation as the area increases unlike transformation methods that assume independence of return period resulting in a constant coefficient of variation.

IDF Curves Evaluation
IDF curves derived from PERSIANN-CDR are evaluated against NOAA Atlas 14 precipitation frequency estimates. The evaluation is performed over CONUS except for the states of Washington, Oregon, Idaho, Montana, Wyoming, and Texas, where NOAA Atlas 14 estimates are unavailable, as shown in Figure 1. The evaluation is performed for IDF curves with durations of one, two, and three days and return periods of 2, 5, 10, 25, 50, and 100 years.
The main metric used for evaluation of IDF estimates derived from PERSIANN-CDR is the percentage relative error which is defined as follows:

This is an adequate performance metric since it is normalized and therefore not sensitive to the absolute values of rainfall. This allows us to examine the performance of IDF estimates over the whole spatial domain regardless of variations in climate. Figure 8 shows the relative error of IDF curves over the whole spatial domain of NOAA Atlas 14 (see Figure 1) for durations of one, two, and three days and return periods of 2, 5, 10, 25, 50, and 100 years. While the errors are considerable for the one-day duration, with median errors in the range −17% to −22% for return periods of 2 to 100 years, the errors are smaller for longer durations. For example, for two-day IDF, the median errors range from −6% to −12%; for three-day IDF, they range from −3% to −8%. This trend of improved performance with longer durations is due to the increased accuracy of satellite-based precipitation over long time scales as well as the temporal mismatch between remotely sensed and gauged rainfall over short periods. It should also be noted that the errors are more pronounced at high return periods; this is attributed to the relatively short record of PERSIANN-CDR (~30 years) compared to the length of record used to derive NOAA Atlas 14, which on average ranges from 54 to 68 years (Bonnin et al., 2006, 2011; Perica et al., 2011, 2013a, 2013b). Overall, IDF relationships derived from PERSIANN-CDR tend to underestimate the amount of precipitation; however, the errors are not significant for durations of two days and longer. Since the previous analysis only reveals information about the aggregate performance over the whole NOAA Atlas 14 spatial domain, it is important to examine the accuracy of IDF curves over different geographic regions. Therefore, IDF curves have been evaluated separately over each of the geographic sections shown in Figure 1.
While the average relative errors over all geographic regions are comparable and do not indicate large differences as shown in Figure 9a, the percentages of IDF curves that lie within the confidence interval of NOAA Atlas 14 clearly highlight that the accuracy of IDF curves derived from PERSIANN-CDR varies significantly. As can be seen from Figure 9b, the accuracy is higher over the Northeastern States since 77, 86, and 84% of one-day, two-day, and three-day IDF curves, respectively, lie within the 90% confidence interval. It is followed by the Southeastern States where approximately 43, 79, and 86% of one-day, two-day, and three-day IDF curves lie within the confidence interval. The poorest accuracy is observed over the Southwestern States where only 20% of one-day IDF curves lie within the confidence interval. This is primarily due to the inadequacy of the bias adjustment model over the Central Valley of California (see Figure S1).
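The two evaluation measures discussed above, the median relative error and the percentage of estimates that lie within the benchmark confidence interval, can be illustrated with a minimal sketch. The function names and the four grid-cell values are hypothetical and chosen only for illustration; they are not data from this study, and the sketch treats a single duration and return period rather than whole curves.

```python
from statistics import median

def percent_relative_error(satellite, reference):
    """Percentage relative error of a satellite estimate against a reference."""
    return 100.0 * (satellite - reference) / reference

def percent_within_interval(estimates, lower, upper):
    """Percentage of estimates that fall inside [lower, upper], cell by cell."""
    inside = sum(lo <= est <= hi for est, lo, hi in zip(estimates, lower, upper))
    return 100.0 * inside / len(estimates)

# Hypothetical one-day, 25-year depths (mm) at four grid cells (illustrative only).
satellite = [80.0, 95.0, 60.0, 120.0]
benchmark = [100.0, 100.0, 75.0, 110.0]
benchmark_lower = [85.0, 90.0, 65.0, 100.0]   # assumed lower confidence bounds
benchmark_upper = [115.0, 112.0, 88.0, 125.0]  # assumed upper confidence bounds

errors = [percent_relative_error(s, r) for s, r in zip(satellite, benchmark)]
median_error = median(errors)  # -12.5
coverage = percent_within_interval(satellite, benchmark_lower, benchmark_upper)  # 50.0

print(f"median relative error: {median_error:.1f}%")
print(f"within confidence interval: {coverage:.0f}%")
```

The example also shows why the two measures can disagree: a region can have a moderate median error while many individual estimates still fall outside a narrow benchmark interval.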
In order to understand the sources of observed errors in PERSIANN-CDR IDF curves, we compare IDF curves from the original PERSIANN-CDR (i.e., without adjustment and area-to-point transformation) and from the CPC record; the results are shown in Figure 10. By comparing IDF curves derived from CPC (black dotted lines) and NOAA Atlas 14 (black lines), it can be clearly seen that IDFs from CPC underestimate the benchmark. Since the data used to derive NOAA Atlas 14 are identical to the CPC data, the observed differences are primarily due to the length of record, as we have used a CPC record approximately 30 years long. This highlights that while the observed errors can potentially be attributed to several sources, such as the difference in spatial scale, it is important to consider the relatively short length of record as the main source of underestimation.
An important conclusion from Figure 10 is that the bias adjustment and the area-to-point transformation improve the results significantly. This can be clearly seen by comparing the original PERSIANN-CDR IDFs (red dotted line) and IDFs derived after adjustment and transformation (red lines). For example, in both Figures 10a and 10b, IDFs derived from adjusted PERSIANN-CDR lie within the confidence interval of NOAA Atlas 14; meanwhile, IDFs before adjustment are considerably underestimated and lie outside the confidence interval. Figure 10c shows an example of an IDF where bias adjustment and area-to-point transformation improve the results, yet not sufficiently, as the obtained IDF (red line) still lies outside the confidence interval. On the other hand, it can be seen from Figure 10d that IDF curves from the original PERSIANN-CDR (red dotted line), that is, without adjustment and area-to-point transformation, already overestimate the benchmark. In this case, the framework used in this study to adjust the bias and account for the areal nature of satellite-retrieved precipitation exacerbates the errors, leading to an increased overestimation as shown by the red line in Figure 10d. This highlights that while the adjustments embedded in the methodology are essential for the development of accurate IDF curves, special attention should be paid to regions where satellite-based precipitation products show peculiar performance, as is the case over the Central Valley of California.

Uncertainty and Impact of Regionalization
The issue of uncertainty in satellite-based IDF curves is more difficult than when ground measurements are used to develop these curves. This is because there are several components of uncertainty to be considered. First, uncertainty arises from the estimation process since satellites do not measure precipitation directly but rather utilize other information as a proxy for rainfall rate. Second, there are uncertainties induced by the methodological framework proposed in this study; these include the bias adjustment model and the transformation from areal to point rainfall. Finally, the commonly considered source of uncertainty is estimation of distribution parameters.
In this section, we only discuss uncertainty that arises from the estimation of distribution parameters. Confidence intervals of IDF estimates are computed using the Monte Carlo bootstrapping described in section 3.5. First, we highlight the importance of regionalization in constraining the uncertainty to narrower limits. Figure 11a shows the coefficient of variation for the distribution of quantiles corresponding to 2, 5, 10, 25, 50, and 100 years for both regionalized and at-site (i.e., no regionalization) estimation. The distribution is obtained by drawing 1,000 samples and estimating the quantiles from each. The coefficient of variation (i.e., the ratio of the standard deviation to the mean) is used to assess uncertainty since it is a normalized measure and hence allows us to examine all regions regardless of variation in their climate. It can be seen that for lower quantiles, such as those corresponding to the 2- and 5-year return periods, the impact of regionalization is barely noticeable. However, as higher quantiles are considered, the differences in the coefficient of variation become more pronounced, with regionalization leading to lower coefficients of variation. This indicates the importance of regionalization in reducing the uncertainty, particularly for quantiles in the tail of the distribution.
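The comparison of coefficients of variation can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of section 3.5: it assumes a Gumbel distribution fitted by the method of moments, uses a plain nonparametric bootstrap, and stands in for regionalization by simply pooling synthetic annual maxima from ten cells that are treated as homogeneous and independent. All function names and parameter values are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant

def gumbel_quantile(sample, return_period):
    """Quantile for a return period from a Gumbel fit by the method of moments."""
    n = len(sample)
    mean = math.fsum(sample) / n
    std = math.sqrt(math.fsum((x - mean) ** 2 for x in sample) / (n - 1))
    scale = std * math.sqrt(6.0) / math.pi
    location = mean - EULER_GAMMA * scale
    return location - scale * math.log(-math.log(1.0 - 1.0 / return_period))

def bootstrap_cv(sample, return_period, n_boot=1000, seed=0):
    """Coefficient of variation (std / mean) of the quantile over bootstrap resamples."""
    rng = random.Random(seed)
    q = [gumbel_quantile(rng.choices(sample, k=len(sample)), return_period)
         for _ in range(n_boot)]
    mean_q = math.fsum(q) / n_boot
    std_q = math.sqrt(math.fsum((x - mean_q) ** 2 for x in q) / (n_boot - 1))
    return std_q / mean_q

def synthetic_maxima(n, rng, location=60.0, scale=15.0):
    """Draw n synthetic annual maxima (mm) from a Gumbel distribution."""
    return [location - scale * math.log(-math.log(max(rng.random(), 1e-12)))
            for _ in range(n)]

rng = random.Random(42)
at_site = synthetic_maxima(30, rng)              # one cell, 30 annual maxima
regional = at_site + synthetic_maxima(270, rng)  # ten cells pooled (assumed homogeneous)

# Coefficient of variation per return period: (at-site, pooled).
cv = {T: (bootstrap_cv(at_site, T), bootstrap_cv(regional, T)) for T in (2, 10, 100)}
for T, (cv_site, cv_pooled) in cv.items():
    print(f"T={T:>3} yr  at-site CV={cv_site:.3f}  pooled CV={cv_pooled:.3f}")
```

Because the pooled cells are drawn independently here, the sketch likely overstates the benefit relative to real gridded data, where neighboring cells are correlated.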
Furthermore, regionalization in the case of satellite-based precipitation is more effective in reducing uncertainty since the amount of available data is considerably larger. To illustrate, an arbitrary homogeneous area of 6,250 km² that might be covered by three ground gauges with an average record length of 50 years will generate 3 × 50 = 150 samples, whereas PERSIANN-CDR will provide 10 grid cells × 30 years = 300 samples. The increased sample size will result in a decrease of the uncertainty range. As can be seen from Figure 11b, uncertainty ranges in the case of IDFs derived from PERSIANN-CDR are smaller than those of NOAA Atlas 14. This highlights that implementing regionalization in the case of satellite-based precipitation is more effective.
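As a rough worked example of the sample-size argument, suppose the standard error of a quantile estimate scales with the inverse square root of the number of independent samples. This scaling is an assumption introduced here, not a result of this study; $n_{\mathrm{gauge}}$ and $n_{\mathrm{sat}}$ are notation for the two sample counts quoted above. The count of 10 grid cells is consistent with a nominal cell area of about 625 km² at 0.25° resolution.

```latex
n_{\mathrm{gauge}} = 3 \times 50 = 150, \qquad
n_{\mathrm{sat}} = 10 \times 30 = 300, \qquad
\frac{\mathrm{SE}_{\mathrm{sat}}}{\mathrm{SE}_{\mathrm{gauge}}}
  \approx \sqrt{\frac{n_{\mathrm{gauge}}}{n_{\mathrm{sat}}}}
  = \sqrt{\frac{150}{300}} \approx 0.71
```

Under this idealization the uncertainty range would shrink by roughly 29%. Since neighboring grid cells are spatially correlated, the effective number of independent samples is smaller than 300 and the actual reduction is likely to be more modest.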

Conclusions
The goal of this paper is to contribute to and advocate for the development of methods that facilitate the use of satellite-retrieved precipitation in developing IDF curves. This is of particular interest to developing countries where existing networks of ground gauges do not provide sufficient spatial coverage or record length to develop IDF curves. Given the continuous advancement in remote sensing and the retrieval of precipitation from satellites, it is worthwhile to dedicate more research effort toward the development of methods that ensure the incorporation of satellite-based precipitation in the design, operation, and planning of infrastructure. This study has attempted to examine this issue from a methodological point of view by considering and accounting for the characteristics of satellite-based precipitation. The methodology used in this study differs from previously reported studies on the use of satellite-based precipitation in the development of IDF curves, which approached this issue from a case study perspective. The ultimate aim of this study is to contribute to the development of general methodologies that can provide adequate results in the absence of in situ rainfall measurements.
While the main motivation for this research is the potential use of satellite-based precipitation to construct IDF curves for developing countries, it is important at this early stage of methodological research to examine the methods by evaluating them in regions with extensive networks of ground gauges with sufficient length of record. This has been the rationale behind the selection of CONUS as a test bed to evaluate the proposed methodology. It is important to emphasize that the methods proposed in this study are neither tailored to a specific region nor to a specific satellite-based precipitation product. Furthermore, we emphasize that estimating the adjustment model parameters is the only step in the proposed framework that requires the availability of ground-based measurements. The question then arises: "Are these parameter estimates sufficiently robust such that they can be applied in other regions?" The answer is twofold. First, it is expected that these estimates are robust over most regions since data from all grids over CONUS, which represent a variety of climatic and precipitation regimes, have been used in the estimation process. Second, as has been shown in this study, the model has limitations in adjusting overestimation bias over specific locations such as the Central Valley of California. Further studies are therefore needed to explore the bias in PERSIANN-CDR as well as other satellite-based precipitation products over different regions. It is also important to note that in regions with partial coverage of ground rainfall gauges, information from ground gauges may be incorporated to validate the bias adjustment model presented in this study, which will lead to improved performance. We also acknowledge that the bias in other satellite-based precipitation products does not necessarily follow the same characteristics observed in PERSIANN-CDR. For example, Endreny and Imbeah (2009) reported that bias in TRMM rainfall depths over Ghana is primarily overestimation bias; meanwhile, in this study, PERSIANN-CDR mainly exhibits underestimation bias. Thus, an extensive analysis of bias is required in other satellite-based products prior to their use in IDF development.
Overall, the results of this study highlight the potential of using satellite-based precipitation as an alternative source to the commonly used ground-based measurements in developing IDF curves. Through comparison with NOAA Atlas 14 estimates, which have been used as a benchmark, we found that the median relative errors in satellite-based IDFs over CONUS are in the range of (−17 to −22%), (−6 to −12%), and (−3 to −8%) for one-day, two-day, and three-day IDFs, respectively. Furthermore, a significant percentage of satellite-based IDF curves fall within the confidence interval of NOAA Atlas 14 for most geographic sections of CONUS, with the best results over the Northeastern States, where 77, 86, and 84% of one-day, two-day, and three-day IDFs lie within the confidence interval. These promising results corroborate findings reported in Gado et al. (2017), which demonstrated that the use of satellite-based data with bias adjustment from local gauges provides accurate quantile estimates. The increase in IDF error with increasing return period can be attributed to uncertainty associated with the short length of record; this relationship is consistent with uncertainty analysis of IDFs derived from remotely sensed observations (Marra et al., 2017). We also highlight that IDFs derived from PERSIANN-CDR in this study over the Central Valley of California exhibit higher errors since the original product overestimates in this region. This emphasizes the importance of considering any peculiar performance of satellite-based precipitation over specific regions prior to the development of IDF curves. It also indicates that elevation is not a satisfactory and/or sufficient explanatory variable in some locations to adjust bias in extreme satellite-based precipitation.
Finally, there are several important questions regarding the use of satellite-based precipitation in IDF development that remain unanswered and in need of further investigation. First, the different sources of uncertainty in satellite-based IDFs, which arise from the estimation of rainfall rates, bias adjustment, transformation of areal to point rainfall, and the estimation of distribution parameters, need to be quantified. In this study, we only dealt with the uncertainty in the estimation of distribution parameters; however, other sources of uncertainty should not be ignored. A possible approach to deal with uncertainty from the estimation process is to consider several satellite-based precipitation products in an ensemble approach, which will provide uncertainty limits for the random error component. With regard to bias adjustment, it might be beneficial to estimate the parameters of the adjustment model using Bayesian regression to provide uncertainty bounds on the parameter estimates. Second, further research is needed to investigate the impact of the liquid/frozen precipitation partitioning, since satellite-based precipitation provides estimates of the total precipitation, while in the development of IDF curves usually only liquid precipitation is considered. This might only be of significance in regions that receive considerable amounts of frozen forms of precipitation (i.e., snow, ice, and hail) during extreme precipitation events. Finally, as the results of this study have shown that regionalization of IDFs derived from satellite-based estimates is more effective in reducing the uncertainty in distribution parameters due to the availability of more information, it is important to develop regionalization methods that can exploit the information content of satellite-based precipitation data sets more efficiently.