A Framework For Hydroclimate Prediction and Discovery Using Object-Oriented Data
This thesis introduces a new object-oriented precipitation data set and explores statistical methods for predicting monthly precipitation and discovering the impact of climate variability on precipitation. The data set consists of segmented, near-global satellite precipitation data characterized as four-dimensional (4D) objects (longitude, latitude, time, and intensity). Our source data are the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) 0.25-degree data set, which covers 60°N to 60°S from March 1, 2000 to January 1, 2011. The resulting data set is called PERSIANN-CONNected objECT (CONNECT) and is stored in a PostgreSQL database. Using this novel data set, we propose a prediction and discovery framework that 1) empirically studies monthly precipitation systems, 2) builds accurate prediction models, and 3) estimates the relevance of the features included in a data matrix of attributes. We apply four machine learning models, 1) Lasso, 2) Elastic Net, 3) Gradient Boosting Trees, and 4) Extremely Randomized Trees, to a precipitation prediction problem, validating the models with a leave-one-out (LOO) prediction strategy and estimating confidence with bootstrap resampling. Our case study focuses on a subset of 626 Western U.S. precipitation systems. The study shows the joint effects of three selected climate phenomena, 1) the Arctic Oscillation (AO), 2) the El Niño Southern Oscillation (ENSO), and 3) the Madden-Julian Oscillation (MJO), on these 626 precipitation systems by analyzing the increased or decreased likelihood of precipitation systems occurring over the Western U.S.
In addition, this thesis finds that the machine learning methods produce accurate monthly precipitation frequency predictions, comparable to climatology at different monthly lead times. The methods also identify relevant features that correspond to interacting modes of climate, such as the Western Hemisphere Warm Pool (WHWP), Atlantic Meridional Mode Sea Surface Temperatures (AMMSST), the North Pacific Index (NP), and the South West Monsoon Index (SWMONSOON), leading to alternative physical explanations of Western U.S. precipitation variability. Given the importance of monthly prediction in water resource planning and management, this framework provides an approach to understanding Western U.S. precipitation and, more importantly, one that can be applied to the study of precipitation around the world.
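As an illustration of the validation strategy summarized above, the following minimal sketch fits the four model families with leave-one-out prediction, compares them against a leave-one-out climatology baseline, and attaches a percentile bootstrap interval to each error estimate. It assumes scikit-learn and NumPy, which the abstract does not name. The data are synthetic stand-ins rather than CONNECT records, and the hyperparameters, the choice of mean absolute error, and the helper names `loo_predictions` and `bootstrap_mae_interval` are hypothetical choices made for this example, not the thesis's actual configuration.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor, GradientBoostingRegressor
from sklearn.linear_model import ElasticNet, Lasso
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT from CONNECT): 48 "months" x 5 hypothetical
# climate-index features; the target depends on features 0 and 2 only.
n_months, n_features = 48, 5
X = rng.normal(size=(n_months, n_features))
y = 10.0 + 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=n_months)

# Illustrative hyperparameters; assumed values, not taken from the thesis.
models = {
    "lasso": Lasso(alpha=0.1),
    "elastic_net": ElasticNet(alpha=0.1, l1_ratio=0.5),
    "gradient_boosting": GradientBoostingRegressor(n_estimators=50, random_state=0),
    "extra_trees": ExtraTreesRegressor(n_estimators=50, random_state=0),
}


def loo_predictions(model, X, y):
    """Predict each month from a model fit on all the other months."""
    return cross_val_predict(model, X, y, cv=LeaveOneOut())


def bootstrap_mae_interval(y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Return (MAE, lower, upper) using a percentile bootstrap over months."""
    boot_rng = np.random.default_rng(seed)
    abs_err = np.abs(y_true - y_pred)
    # Each row of idx is one resample (with replacement) of the months.
    idx = boot_rng.integers(0, len(abs_err), size=(n_boot, len(abs_err)))
    boot_mae = abs_err[idx].mean(axis=1)
    lower, upper = np.quantile(boot_mae, [alpha / 2, 1 - alpha / 2])
    return abs_err.mean(), lower, upper


results = {}
for name, model in models.items():
    results[name] = bootstrap_mae_interval(y, loo_predictions(model, X, y))

# Climatology baseline: predict each month with the mean of all other months.
y_clim = (y.sum() - y) / (n_months - 1)
results["climatology"] = bootstrap_mae_interval(y, y_clim)

for name, (mae, lower, upper) in results.items():
    print(f"{name}: LOO MAE = {mae:.2f} (95% bootstrap interval {lower:.2f} to {upper:.2f})")

# One simple relevance estimate: absolute Lasso coefficients on the full data.
relevance = np.abs(Lasso(alpha=0.1).fit(X, y).coef_)
print("features ranked by |Lasso coefficient|:", np.argsort(relevance)[::-1])
```

In this sketch the bootstrap resamples the per-month absolute errors, which is only one of several ways to estimate confidence. Resampling the training months and refitting each model would be a heavier alternative, and the abstract does not specify which variant was used.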