Bias adjustment of infrared-based rainfall estimation using Passive Microwave satellite rainfall data

This study explores using Passive Microwave (PMW) rainfall estimation for spatial and temporal adjustment of Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Cloud Classification System (PERSIANN-CCS). The PERSIANN-CCS algorithm collects information from infrared images to estimate rainfall. PERSIANN-CCS is one of the algorithms used in the Integrated Multisatellite Retrievals for GPM (Global Precipitation Measurement) estimation for time periods when PMW rainfall estimations are limited or not available. Continued improvement of PERSIANN-CCS will support Integrated Multisatellite Retrievals for GPM for current as well as retrospective estimations of global precipitation. This study takes advantage of the high spatial and temporal resolution of GEO-based PERSIANN-CCS estimation and the more effective, but lower sample frequency, PMW estimation. The Probability Matching Method (PMM) was used to adjust the rainfall distribution of GEO-based PERSIANN-CCS toward that of PMW rainfall estimation. The results show that a significant improvement of global PERSIANN-CCS rainfall estimation is obtained.


Introduction
Better understanding of the spatial and temporal distribution and variability of precipitation is crucial for hydrological process modeling and the study of extreme hydrometeorological events such as floods and droughts. Rain gauges are the most popular method to measure precipitation; however, their sparse distribution over remote regions and mountain areas is a challenge. Although ground-based radars can provide rainfall at high spatial and temporal resolution, they have their own limitations, especially in providing coverage over high terrain, which blocks the radar beam. Satellite observations in various spectral bands provide continuous coverage and, through a variety of algorithms, can produce rainfall estimates. Satellite observations are considered a viable alternative to ground-based precipitation measurements [Barret, 2001]. Geosynchronous Earth orbit (GEO) and low Earth orbit (LEO) satellites are the two most popular satellite systems for estimation of meteorological parameters. GEO satellites carry sensors with spectral wavelengths from visible (VIS) to longwave infrared (IR), which are related to albedo and cloud top temperature, respectively, rather than directly to surface rain. GEO satellite imagery has been frequently used for rainfall estimation due to its high spatial and temporal resolution [Wu et al., 1985; Griffith et al., 1978; Scofield, 1987; Adler and Negri, 1988; Vicente et al., 1998]. Although they have lower spatial and temporal sampling frequency and resolution, Passive Microwave (PMW) sensors on LEO satellites can penetrate clouds to measure thermal emissions, which are attenuated by raindrops [Marzano et al., 2004; Tapiador, 2008; Behrangi et al., 2009]. Effectively integrating estimations from LEO-PMW and GEO-VIS/IR information provides an option for improving precipitation estimation at high spatial and temporal scales.
For example, the Microwave Infrared Rainfall Algorithm [Todd et al., 2001] uses PMW and IR data for rainfall estimation, assuming that PMW sensors provide the more accurate estimates of instantaneous rainfall. The Self-Calibrating Multivariate Precipitation Retrieval (SCaMPR) algorithm estimates rainfall at a fine temporal resolution using PMW and GEO satellites. SCaMPR uses Special Sensor Microwave/Imager (SSM/I) data to distinguish between rain and no-rain pixels and then uses Geostationary Operational Environmental Satellites (GOES) data to calibrate the Tb-RR relationship via linear regression for the precipitating pixels [Kuligowski, 2002, 2009]. Kidd et al. [2003] used the histogram matching technique between PMW rainfall data and IR cloud top temperature to estimate rainfall over Africa during a 4 month calibration period. The Climate Prediction Center morphing method (CMORPH) uses motion advection vectors from dynamic GEO-IR images to fill the temporal gaps between two available PMW rainfall estimates [Joyce et al., 2004]. The Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA) combined precipitation estimates from multiple satellites, as well as gauges where feasible, to generate rainfall data at 0.25° × 0.25° resolution every 3 h.
Building upon the success of TRMM, which operated from November 1997 to April 2015, the Global Precipitation Measurement (GPM) satellite was launched in February 2014. GPM is a concept based on using the available LEO-PMW satellite data and adding GEO-IR-based precipitation data to provide global precipitation estimation in near real time.
One key data set from the NASA GPM program is Integrated Multisatellite Retrievals for GPM (IMERG). IMERG integrates both low Earth orbit (LEO) and geostationary Earth orbit (GEO) satellite information and surface precipitation gauge analysis to provide global precipitation [Huffman et al., 2015]. The satellite-based estimation of IMERG is a merged retrieval using algorithms developed by three groups [Huffman et al., 2007; Joyce and Xie, 2011; Hong et al., 2004]. IMERG has been developed to provide better quality and shorter time latency for global precipitation monitoring at 0.1° × 0.1° spatial and half-hourly temporal resolution. The time latency of the data products is around 4 h and 12 h from observation time for the "Early" and "Late" near real-time multisatellite products, respectively. The "Final" satellite gauge-merged product is available around 2 months after observation time [Huffman et al., 2015].
The purpose of this paper is to report the improvements made to the version of PERSIANN-CCS currently used in IMERG. PERSIANN-CCS, which uses only IR data to estimate precipitation indirectly from the cloud top temperature, has some inherent uncertainties that result in overestimation or underestimation, depending on the time and geolocation. In this study we provide the results of combining precipitation data from various sources to improve the PERSIANN-CCS rainfall estimation. The paper further lays out the application of the Probability Matching Method to adjust PERSIANN-CCS rainfall toward the PMW rainfall distribution based on climatology data and tests the process over one full year. In section 2, we describe the data sets and location used in this study. Methodology and scenarios of combining different sensors are described in section 3. The results and findings are discussed in section 4 using statistical validation and visual analysis. Finally, the conclusions of this study are presented in section 5.

PERSIANN-CCS Estimation
PERSIANN-CCS is an object-based algorithm that uses cloud coverage areas under defined IR temperature thresholds to estimate rainfall. The algorithm uses the longwave IR channel (10.7 μm) to classify cloud images based on IR brightness temperature segmentation and then extracts features for the cloud patches at three levels (220 K, 235 K, and 253 K). The rain estimation from PERSIANN-CCS is based on calibrating the cloud top temperature and rainfall (Tb-R) relationships for the classified cloud groups. When the algorithm was first developed over the CONUS, this calibration used gauge-corrected hourly radar rainfall data from the Next Generation Weather Radar (NEXRAD) network. The current operational PERSIANN-CCS algorithm was trained using one full year of PMW rainfall for near-global coverage (60°N to 60°S and 180°W to 180°E), with 24 separately trained overlapping subareas to account for latitude and terrain variability over the globe.
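The threshold-based patch extraction described above can be illustrated with a minimal sketch. It is not the operational PERSIANN-CCS segmentation: it simply labels 4-connected regions colder than each of the three levels and computes a few example features. The function names, the 4-connectivity choice, and the feature set (area, minimum and mean temperature) are assumptions made for illustration.

```python
from collections import deque

# Temperature levels (kelvin) named in the text.
THRESHOLDS_K = (220.0, 235.0, 253.0)


def label_cold_patches(tb, threshold):
    """Group pixels with brightness temperature <= threshold into patches.

    tb is a list of rows of brightness temperatures in kelvin.
    Returns a list of patches, each a list of (row, col) pixels.
    4-connectivity is an assumption of this sketch.
    """
    n_rows, n_cols = len(tb), len(tb[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    patches = []
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if seen[r0][c0] or tb[r0][c0] > threshold:
                continue
            # Breadth-first flood fill from an unvisited cold pixel.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            patch = []
            while queue:
                r, c = queue.popleft()
                patch.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < n_rows and 0 <= cc < n_cols
                            and not seen[rr][cc]
                            and tb[rr][cc] <= threshold):
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            patches.append(patch)
    return patches


def patch_features(tb, patch):
    """Example features for one patch (hypothetical feature set)."""
    temps = [tb[r][c] for r, c in patch]
    return {
        "area_px": len(patch),
        "min_tb": min(temps),
        "mean_tb": sum(temps) / len(temps),
    }


def features_by_level(tb):
    """Extract patch features at each of the three temperature levels."""
    return {
        level: [patch_features(tb, p) for p in label_cold_patches(tb, level)]
        for level in THRESHOLDS_K
    }
```

Warmer thresholds enclose larger patches, so the same cloud system contributes features at several levels; these features would then feed the cloud classification and the Tb-R calibration.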

PMW Precipitation Data
PMW precipitation estimation from LEO satellites provides fewer samples per day and has lower spatial and temporal resolution than estimation from GEO satellites. The PMW precipitation data set (MWCOMB), obtained from the NOAA Climate Prediction Center (CPC), was used in this study. MWCOMB is a blended precipitation data set from multiple sensors and orbits, such as DMSP SSM/I, NOAA AMSU-B, and the TRMM Microwave Imager [Ferraro et al., 1994; Kummerow et al., 2001; Weng et al., 2003; Huffman et al., 2007]. MWCOMB data are processed to 8 km resolution from 60°N to 60°S and 180°E to 180°W and then regridded to 0.25° latitude-longitude for this study.
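The regridding step can be sketched as a simple box average of fine-resolution samples onto the 0.25° study grid. The text does not state which regridding method was used, so box averaging, the function name, and the treatment of negative values as missing data are all assumptions of this sketch.

```python
import math
from collections import defaultdict


def regrid_mean(samples, cell_deg=0.25, lat0=-60.0, lon0=-180.0):
    """Average (lat, lon, rain) samples into cell_deg x cell_deg cells.

    Returns {(i_lat, i_lon): mean_rain}, with indices counted from the
    southwest corner (lat0, lon0) of the near-global domain.
    Samples that are None or negative are treated as missing
    (a hypothetical convention, not taken from the source).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, rain in samples:
        if rain is None or rain < 0.0:
            continue
        key = (
            math.floor((lat - lat0) / cell_deg),
            math.floor((lon - lon0) / cell_deg),
        )
        sums[key] += rain
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

The same routine applies to any product finer than the target grid, provided each sample carries its pixel-center coordinates.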
observations for rainfall estimation at 1 km resolution every 5 min. At this time, NSSL is developing quantitative precipitation estimation which produces a 3-D radar mosaic grid from the precipitation (Q3) over CONUS. In this study, Q2 estimation is processed to 0.25° × 0.25° resolution at half-hourly intervals for evaluating PERSIANN-CCS rainfall before and after adjustment.

Methodology
Studies have shown that using precipitation data from multiple satellites can improve the measurement spatially and temporally. This study explores the use of LEO satellite microwave precipitation data for recalibration of GEO-based satellite estimates from the PERSIANN-CCS algorithm. The Probability Matching Method (PMM) is used to adjust PERSIANN-CCS precipitation estimation toward LEO-based PMW estimation. PMM relates two variables through their marginal distributions without considering their joint distribution [Ciach et al., 1997]. PMM was developed in 1987 [Calheiros and Zawadzki, 1986] and has since been extended to meteorological applications [Piman et al., 2007]. Krajewski and Smith [1991] and Rosenfield and Amitai [1998] used PMM to improve radar rainfall estimation using gauge measurements by assuming that the radar-observed reflectivity has the same probability of occurrence as the gauge-measured rain intensity. In this study the rainfall distribution estimated from PERSIANN-CCS is processed to match LEO-PMW rainfall data for their concurrent samples. Figure 1 explains how the precipitation from PERSIANN-CCS can be adjusted with PMW rainfall estimates based on the cumulative distribution function (CDF) computed from the probability density function (PDF). The left panel of Figure 1 shows the PDFs of rainfall rate for PMW (solid line) and PERSIANN-CCS (dashed line), calculated from the concurrent samples of climatology data. The right panel of Figure 1 shows the CDFs calculated from the PDFs of the PMW and PERSIANN-CCS data and illustrates how the bias adjustment is done using the PMM. As the arrow in the right panel shows, the rainfall rate from PERSIANN-CCS is replaced by the PMW estimate that has the same CDF value. Equations (1)-(3) explain how the PDF and CDF are calculated.
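The equations themselves are not reproduced in this text. A standard way of writing probability matching, consistent with the symbols defined in the sentence that follows, is sketched here; the exact form and numbering of the original equations are an assumption.

```latex
\begin{align}
\mathrm{CDF}(R_{CCS}) &= \int_{0}^{R_{CCS}} p(R'_{CCS})\, dR'_{CCS} \tag{1}\\
\mathrm{CDF}(R_{PMW}) &= \int_{0}^{R_{PMW}} p(R'_{PMW})\, dR'_{PMW} \tag{2}\\
\mathrm{CDF}(R_{CCS}) &= \mathrm{CDF}(R_{PMW}) \tag{3}
\end{align}
```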
where R_CCS represents the precipitation from PERSIANN-CCS and R_PMW is the precipitation from LEO PMW estimates; p(R_CCS) and p(R_PMW) are the probability density functions (PDFs) of R_CCS and R_PMW. The PDFs of R_CCS and R_PMW are calculated from concurrent samples of historical data, assuming that the precipitation estimates R_PMW and R_CCS follow the same distribution each year. PMM can also be estimated from PDFs calculated from more recent data within a fixed time window and dynamically adjusted over time as new data become available. Bias adjustment of PERSIANN-CCS can also be based on a combination of climatology and recent data. In this study, only PMM based on climatology data is presented. The concurrent data samples were collected for each month separately over a 4 year period (2008-2011). Precipitation estimation from PERSIANN-CCS is at a 0.04° × 0.04° spatial resolution every 0.5 h, while PMW precipitation from MWCOMB is at 0.25° × 0.25°. The cumulative distribution functions (CDFs) of both data sets must be calculated from concurrent samples at the same spatial resolution; therefore, PERSIANN-CCS is regridded to 0.25° × 0.25° spatial resolution. Meanwhile, to cover the regional and seasonal variability, the PDFs are calculated for each 5° × 5° subarea of the global coverage and for each month separately, so that a sufficient number of samples is included for model calibration. A lookup table is used for rainfall adjustment for each month and subarea. Figure 2 summarizes the PMM procedure described above in one chart.
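The lookup-table construction for a single month and subarea can be sketched as follows, under stated assumptions: the table stores matching quantiles of the two sample sets at evenly spaced CDF levels, and rates between levels are linearly interpolated. The number of levels, the interpolation scheme, and the function names are illustrative choices, not details given in the text.

```python
from bisect import bisect_right


def empirical_quantiles(samples, probs):
    """Sample quantiles at the given CDF levels (linear interpolation)."""
    xs = sorted(samples)
    if not xs:
        raise ValueError("need at least one sample")
    n = len(xs)
    out = []
    for p in probs:
        pos = p * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(xs[lo] * (1.0 - frac) + xs[hi] * frac)
    return out


def build_pmm_table(ccs_samples, pmw_samples, n_levels=101):
    """Pair CCS and PMW rain rates that share the same CDF value.

    Both inputs are concurrent samples for one month and one subarea.
    n_levels is a hypothetical table size.
    """
    probs = [i / (n_levels - 1) for i in range(n_levels)]
    return (
        empirical_quantiles(ccs_samples, probs),
        empirical_quantiles(pmw_samples, probs),
    )


def adjust_rate(rate, table):
    """Replace a CCS rate by the PMW rate at the same CDF value."""
    ccs_q, pmw_q = table
    if rate <= ccs_q[0]:
        return pmw_q[0]
    if rate >= ccs_q[-1]:
        return pmw_q[-1]
    # Last level whose CCS quantile does not exceed the rate; the next
    # level is strictly larger, so the division below is safe.
    i = bisect_right(ccs_q, rate) - 1
    frac = (rate - ccs_q[i]) / (ccs_q[i + 1] - ccs_q[i])
    return pmw_q[i] + frac * (pmw_q[i + 1] - pmw_q[i])
```

For example, if the CCS samples in a subarea are systematically twice the PMW samples, `adjust_rate` halves any CCS rate within the sampled range. Rates beyond the largest calibration sample are clamped to the largest PMW quantile, which is one possible design choice among several.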

Experiments and Results
To evaluate the results and see the pattern of adjustment over the globe, we look at the CDFs of PMW and PERSIANN-CCS data over selected 5° × 5° bins (the bins used for calibration), based on the frequency of the climatology data. Figure 3 shows the data sample frequency for the 5° × 5° subareas, globally. The areas located between 30°N and 60°N do not have enough concurrent samples collected during winter over land. PMW data exhibit high uncertainty when representing rain pixels over snow and ice surfaces, and pixels that are possibly ice or cold ground are flagged during the cold seasons. Also, some areas in the Southern Hemisphere such as Australia have fewer concurrent samples than other regions in summer and winter due to inconsistent coverage by the GEO satellites used to generate PERSIANN-CCS. Figure 4 shows the estimated CDFs based on the 4 years of data (climatology data from 2008 to 2011) for PERSIANN-CCS and PMW rainfall at several selected 5° × 5° bins in January (winter in the Northern Hemisphere) and July (summer in the Northern Hemisphere). The bins are selected in different zones to describe the pattern of CDFs over different regions. Bins I and II in the Northern Hemisphere show that PERSIANN-CCS overestimates the rainfall significantly during January. Bins III and IV in Figure 4 are from regions at lower latitudes and show that the CDFs from PERSIANN-CCS and PMW are close to each other in January, which means that their rainfall estimates are similar. Bin V, in the Southern Hemisphere, shows that PERSIANN-CCS underestimates the rainfall over the selected bin during January. Comparing the CDFs over different regions suggests that more adjustment can be expected over areas at higher latitudes. The areas at lower latitudes near tropical regions need only minor adjustment since the CDFs are similar.
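The sample-frequency map can be sketched as a count of usable concurrent pairs per month and 5° × 5° bin. The record layout, the flag handling, and the `min_samples` threshold are hypothetical; the text does not state what sample count is considered sufficient.

```python
import math
from collections import Counter


def bin_index(lat, lon, bin_deg=5.0):
    """Index of the bin containing (lat, lon), counted from 60S, 180W."""
    return (
        math.floor((lat + 60.0) / bin_deg),
        math.floor((lon + 180.0) / bin_deg),
    )


def count_concurrent(records, bin_deg=5.0):
    """Count usable concurrent samples per (month, i_lat, i_lon).

    Each record is (month, lat, lon, ccs_rate, pmw_rate, pmw_flagged).
    A pair is skipped when either rate is missing or when the PMW pixel
    carries an uncertainty flag (for example possible ice or cold ground).
    """
    counts = Counter()
    for month, lat, lon, ccs, pmw, flagged in records:
        if ccs is None or pmw is None or flagged:
            continue
        counts[(month,) + bin_index(lat, lon, bin_deg)] += 1
    return counts


def sparse_bins(counts, min_samples):
    """Bins with fewer than min_samples pairs (threshold is hypothetical).

    Only bins that received at least one sample appear in counts.
    """
    return sorted(key for key, n in counts.items() if n < min_samples)
```

Flagged winter pixels over snow-covered land drop out of the count, which is how the low sample frequencies between 30°N and 60°N would arise in such a tally.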
Similarly, the CDFs of PMW and PERSIANN-CCS rainfall estimation based on the 4 years of concurrent samples are shown for July, a summer month in the Northern Hemisphere. Bins I, II, and V, selected over high-latitude regions, show the CDFs from PMW and PERSIANN-CCS. The comparison between these CDFs shows that the adjustment of PERSIANN-CCS estimation toward PMW data could be more significant over high-latitude regions than in tropical regions. The CDFs for January and July show that the rainfall distribution follows the same pattern during cold and warm seasons, with more overestimation or underestimation over the midlatitude and high-latitude (60°N to 30°N and 60°S to 30°S) regions.

Validation of Recalibrated PERSIANN-CCS Using PMW Rainfall
One year of data (2012) is used to validate the microwave-adjusted PERSIANN-CCS (hereafter MA-PERSIANN-CCS) estimates. Figure 5 shows the CDFs of precipitation from PERSIANN-CCS estimation after adjustment for the validation year 2012 over several selected 5° × 5° subareas, shown as the solid green line for each subarea. Bins I and II in Figure 5 show the CDFs calculated for the selected midlatitude regions in the Northern Hemisphere. In Bin I, PERSIANN-CCS overestimates the rainfall in winter and underestimates it in summer, and in Bin II it overestimates rainfall in both winter and summer. The solid green line for these two bins shows that the CDF for MA-PERSIANN-CCS moved toward the PMW CDF, which means that the two have similar precipitation estimates during the validation year. The graph for Bin III shows overestimation of rainfall during summer and winter before the adjustment; after the adjustment the CDF for MA-PERSIANN-CCS moves toward the PMW CDF. Bin IV is also selected from the tropical region. The CDF comparisons for both winter and summer for Bin IV show that no significant adjustment is needed.
PERSIANN-CCS rainfall estimates from geostationary satellite imagery are available every 30 min, covering a near-global area from 60°N to 60°S and 180°W to 180°E. PMW precipitation estimation from LEO satellites is available at more limited spatial and temporal resolutions. The PMM provides a systematic way to adjust the PDF of PERSIANN-CCS toward PMW rainfall estimation while keeping the recalibrated PERSIANN-CCS estimation at high spatial and temporal scales.
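One way the adjustment could be applied while keeping the native pixel grid is sketched below: each high-resolution pixel looks up the table for its month and 5° × 5° subarea and has only its rain rate replaced. Applying tables calibrated at 0.25° directly to native-resolution pixels, leaving pixels without a table unchanged, and the table layout itself are assumptions of this sketch.

```python
import math
from bisect import bisect_right


def interp_lookup(rate, ccs_levels, pmw_levels):
    """Map a CCS rate to the PMW rate at the same CDF level."""
    if rate <= ccs_levels[0]:
        return pmw_levels[0]
    if rate >= ccs_levels[-1]:
        return pmw_levels[-1]
    # The level after index i is strictly above the rate, so no zero division.
    i = bisect_right(ccs_levels, rate) - 1
    frac = (rate - ccs_levels[i]) / (ccs_levels[i + 1] - ccs_levels[i])
    return pmw_levels[i] + frac * (pmw_levels[i + 1] - pmw_levels[i])


def adjust_pixels(pixels, month, tables, bin_deg=5.0):
    """Adjust (lat, lon, rate) pixels in place on their own grid.

    tables maps (month, i_lat, i_lon) to (ccs_levels, pmw_levels),
    with bin indices counted from 60S, 180W (hypothetical layout).
    Pixels whose subarea has no table are returned unchanged.
    """
    adjusted = []
    for lat, lon, rate in pixels:
        key = (
            month,
            math.floor((lat + 60.0) / bin_deg),
            math.floor((lon + 180.0) / bin_deg),
        )
        table = tables.get(key)
        new_rate = rate if table is None else interp_lookup(rate, *table)
        adjusted.append((lat, lon, new_rate))
    return adjusted
```

Because only the rate value changes, the output keeps the pixel spacing and the 30 min cadence of the input field.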
To evaluate the rainfall estimation from PERSIANN-CCS and MA-PERSIANN-CCS against PMW satellite data, the evaluation was done for three winter months (January, February, and December) and three summer months (June, July, and August) of two years: 2011, which was included in the calibration period, and the validation year 2012. Figures 6 and 7 show the monthly average rainfall based on the concurrent samples for the three winter months (Figure 6) and three summer months (Figure 7) of 2011. Figures 8 and 9, similarly, depict the monthly averages for the winter and summer seasons, respectively, of validation year 2012. Also, the CDF comparison shows that the adjustment in high-latitude regions is more significant than in tropical areas. By comparing data sets a, b, and c in Figures 6 and 8 we can see more significant improvement.
Figure 10 shows the map of sample counts for each pixel during summer (June, July, and August) and winter (December, January, and February) 2012. From these figures it can be concluded that regions in the Northern Hemisphere between 60°N and 40°N, from 0° to 120°E and from 120°W to 60°W, have lower sample counts for December, January, and February. Also, areas located in the Southern Hemisphere from 0° to 60°S and 120°E to 60°W have low sample counts during summer and winter. Since this method is calibrated based on concurrent samples, the evaluation is less reliable where sample counts are low.
Figure 10. Number of available concurrent data samples for each pixel for winter 2012 (December, January, and February) and summer 2012 (June, July, and August).
Tables 1 and 3 show the seasonal statistical evaluations for December, January, and February of 2011 and 2012, respectively (winter in the Northern Hemisphere). As discussed, during the winter season the evaluation is more reliable in the Southern Hemisphere from 0° to 60°S. As the tables show, zones located in the Southern Hemisphere have improved bias, correlation coefficient, and RMSE values after bias adjustment.
Tables 2 and 4 show the seasonal statistical parameters calculated for the different zones for June, July, and August 2011 and 2012 (summer in Northern Hemisphere). As was discussed for sample size, areas located in the Northern Hemisphere have reliable sample counts for evaluation during summer time. Tables 2 and 4 show that bias, correlation coefficient, and RMSE have improved for MA-PERSIANN-CCS in comparison with PERSIANN-CCS during the summer season for both 2011 and 2012.
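The three statistics reported in the tables can be computed from concurrent sample pairs as sketched below. The text does not define bias explicitly; here it is taken as the mean difference (estimate minus reference), which is an assumption, and the correlation is the Pearson coefficient.

```python
import math


def evaluation_stats(estimate, reference):
    """Bias, Pearson correlation, and RMSE over concurrent pairs.

    Pairs in which either value is None are skipped. Bias is defined
    here as mean(estimate) - mean(reference), an assumed convention.
    """
    pairs = [
        (e, r) for e, r in zip(estimate, reference)
        if e is not None and r is not None
    ]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two concurrent samples")
    mean_e = sum(e for e, _ in pairs) / n
    mean_r = sum(r for _, r in pairs) / n
    rmse = math.sqrt(sum((e - r) ** 2 for e, r in pairs) / n)
    cov = sum((e - mean_e) * (r - mean_r) for e, r in pairs)
    var_e = sum((e - mean_e) ** 2 for e, _ in pairs)
    var_r = sum((r - mean_r) ** 2 for _, r in pairs)
    if var_e > 0.0 and var_r > 0.0:
        corr = cov / math.sqrt(var_e * var_r)
    else:
        corr = float("nan")  # undefined when one series is constant
    return {"bias": mean_e - mean_r, "corr": corr, "rmse": rmse}
```

Running the function once with PERSIANN-CCS and once with MA-PERSIANN-CCS as the estimate, against the same PMW reference for one zone and season, gives the before and after values that such tables compare.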

Validation Over CONUS
Additional validations were performed over the CONUS for PERSIANN-CCS and MA-PERSIANN-CCS for winter (December, January, and February) and summer (June, July, and August) 2012. Figure 11 shows the average monthly rainfall during the winter season over the U.S. based on the concurrent samples of PERSIANN-CCS, PMW, and Radar Q2 for 2012 (December, January, and February). As shown in the figure, PERSIANN-CCS overestimates the rainfall over western areas and underestimates it over southern regions during the winter season of 2012 over the CONUS, while MA-PERSIANN-CCS shows estimates closer to PMW. Comparing average [Behrangi et al., 2014]. (Tables 5 and 6). Table 5 gives the statistical results for PERSIANN-CCS and MA-PERSIANN-CCS compared to PMW, and Table 6 shows the statistical performance of PERSIANN-CCS, MA-PERSIANN-CCS, and PMW against radar data as ground truth. During winter 2012, the bias, correlation coefficient, and RMSE between MA-PERSIANN-CCS and PMW show improvement over those between PERSIANN-CCS and PMW. This implies that applying the climatology-based adjustment to the validation year 2012 has shifted the PERSIANN-CCS rainfall estimation toward the PMW rainfall data. However, MA-PERSIANN-CCS does not show any improvement in winter rainfall estimation when compared with Radar Q2. The bias of −0.37 between PERSIANN-CCS and radar data decreased to −0.69 (more underestimation) between MA-PERSIANN-CCS and Radar Q2. This is due to inconsistencies between the PMW and Radar Q2 data. Statistical validation results for summer are also shown in Tables 5 and 6. As expected, all the statistics for MA-PERSIANN-CCS show improvement over PERSIANN-CCS when compared with PMW. When compared to Radar Q2, MA-PERSIANN-CCS shows a 53% improvement in bias during the summer season, with no significant changes in the other two statistics.
In summary, we have provided a systematic approach for reducing the bias between the GEO-based PERSIANN-CCS and PMW estimates. The next step toward further improvement will focus on increasing the number of PMW samples over high latitudes and during cold seasons. The use of CloudSat and ground-based radars covering the higher latitudes is currently under study.

Discussion
In this study PMW sensors have been used to improve the quality of rain retrieval from a GEO-based satellite algorithm. The effectiveness of PMW data for improving the accuracy of precipitation estimation in some areas has been well established [for example, Behrangi et al., 2009]. The ability of PMW sensors to estimate precipitation differs between land and ocean, and the level of accuracy depends on the sensor characteristics, surface emission, and other factors. Over oceans and other water bodies, because of low emissivity and a homogeneous background, the emissions from raindrops are more easily detectable for precipitation retrieval.
On the other hand, due to the heterogeneity of the surface and possible snow and ice coverage over land, the estimates have a higher uncertainty [Kidd and Levizzani, 2011; Lin and Hou, 2008]. As a result, effective samples for high latitudes and cold seasons are very limited. Figure 3 shows the sample counts for each 5° × 5° bin for January and July during the calibration period; some regions located at higher latitudes have fewer samples than other areas. PMW rainfall estimations over frozen land and snowfall in high-latitude regions are marked with rain uncertainty flags, and as a result, the effective rain sample counts are significantly reduced. Although PMM requires a large number of samples to document the rainfall distribution, in cold seasons over high latitudes PMW rainfall samples are limited due to the high uncertainty in detecting rainfall over frozen surfaces and snowfall. Experiments also show that PMW precipitation underestimates rainfall over high latitudes [Behrangi et al., 2014]. There are potential options for improving PERSIANN-CCS estimation in follow-up activities, including (1) using ground radar over the CONUS for training PERSIANN-CCS over frozen land and then applying the PMM lookup tables to the high-latitude regions, (2) using multiple years of CloudSat products for training PERSIANN-CCS over frozen land, and (3) using Global Precipitation Climatology Project rainfall estimation to train PERSIANN-CCS. We will set up case studies and report our findings in the near future.

Conclusions
Scientific and operational requirements for precipitation observations from satellites will continue to grow and will naturally challenge the scientific community to improve the spatial and temporal resolutions as well as the quality and reliability of the data. Taking advantage of multisensor satellite observations in combination with different in situ measurements to generate a global precipitation map has become a primary goal of the research community. This research supports the GPM (IMERG) program, with a specific focus on improving the accuracy of PERSIANN-CCS, one of the elements of the IMERG algorithm. PERSIANN-CCS, with its unique structure as an IR-based algorithm, fills the gaps between PMW rainfall data.
This paper presents the framework of a Probability Matching Method (PMM) for bias correction of PERSIANN-CCS using PMW precipitation estimates. The global evaluation of both a validation and a calibration year (winter and summer 2011 and 2012) shows that bias, correlation coefficient, and RMSE have improved in MA-PERSIANN-CCS with respect to the corresponding PMW data. The improvement of the rainfall estimates is more pronounced in high-latitude and midlatitude regions and less significant in tropical regions, which require much less adjustment. MA-PERSIANN-CCS was compared with PMW data based on the concurrent samples for summer and winter of 2011 and 2012. The results show that rainfall estimates from MA-PERSIANN-CCS correspond better with PMW data while maintaining the higher spatial and temporal resolutions of PERSIANN-CCS.
Validation was also done over the CONUS, using Q2 radar data as a ground reference as well as PMW data, to compare with PERSIANN-CCS and MA-PERSIANN-CCS rainfall estimates. The comparison with PMW shows that bias, correlation coefficient, and RMSE for MA-PERSIANN-CCS were improved compared with PERSIANN-CCS estimates in winter and summer 2012. Additionally, the data sets were compared with Radar Q2 over the CONUS for winter and summer. MA-PERSIANN-CCS did not exhibit any significant improvement with respect to Radar Q2 data during the winter due to the high level of uncertainty associated with the inability of PMW sensors to detect precipitation over frozen land and lighter precipitation during winter seasons. During the summer season, MA-PERSIANN-CCS shows less bias relative to Radar Q2 than PERSIANN-CCS does, while correlation coefficient and RMSE did not improve significantly.
The next step toward further improvement will be focused on increasing the number of samples for high latitudes during cold seasons. The use of CloudSat and ground-based radars covering the higher latitude is currently under study.