Streamflow simulation and forecasting are important approaches for water resources management and flood mitigation, and both physically-based and data-driven models have been used to predict streamflow at daily and hourly scales. Although physically-based models can help explain the underlying mechanisms of hydrological processes through mathematical and physical equations, their performance largely depends on the availability and quality of spatial information. The computational cost also limits their application in fast, reliable, and accurate streamflow simulation and forecasting. In contrast, data-driven models, especially deep learning models, are capable of learning the nonlinear relationships between inputs and outputs without explicitly representing the physical processes. In addition, deep learning techniques accelerated by graphics processing units (GPUs) allow models to be trained 10 to 20 times faster, making them more promising for input-output simulation.
Current deep learning models rely heavily on prior streamflow information to make predictions, whereas hydrological observations are limited in real-world applications. In this Thesis, the effectiveness of deep learning models for streamflow simulation without using previous streamflow information was explored. Specifically, three deep learning models (ANN, RNN, and LSTM) were used to simulate streamflow at multiple lead times and at different sites using precipitation and soil moisture as inputs. In addition, the ability of deep learning models to learn from a physically-based model and generate pixel-to-pixel streamflow mapping was also tested.
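To make the modeling setup concrete, the sketch below shows, in a hedged form, how an LSTM could map sequences of precipitation and soil moisture to streamflow at a chosen lead time. It is not the Thesis code: the layer sizes, sequence length, and lead-time handling are illustrative assumptions only.

import torch
import torch.nn as nn

class StreamflowLSTM(nn.Module):
    def __init__(self, n_features=2, hidden_size=64):
        super().__init__()
        # n_features = 2: precipitation and soil moisture at each time step
        # (illustrative choice, not the Thesis configuration)
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)  # single streamflow value

    def forward(self, x):
        # x: (batch, sequence_length, n_features)
        out, _ = self.lstm(x)
        # Use the last hidden state to predict streamflow at the lead time
        return self.head(out[:, -1, :])

model = StreamflowLSTM()
forcings = torch.randn(8, 24, 2)   # e.g., 24 hourly steps of forcing data
predicted_flow = model(forcings)   # shape: (8, 1)

An ANN or RNN variant would follow the same input-output structure, differing only in the layers used in place of the LSTM cell.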
All three deep learning models can achieve reliable and accurate streamflow simulations at lead times of 1-5 h at the Ninnekah site, with Pearson correlation coefficients (r) over 0.9 for all three models. Based on the predictions of peak and low flows at different sites, the LSTM model shows the most consistent and reliable streamflow simulation results and performs better across different basin scales. The U-net model can learn the patterns of the ParFlow-CLM model well, achieving average r values of 0.770 and 0.820 over the whole domain for validation and calibration, respectively. However, the models have limitations in predicting peak flows because of the imbalanced nature of the data.
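For reference, the r values above follow the standard definition of the Pearson correlation coefficient between simulated and observed flow. The snippet below is an illustrative calculation only, with hypothetical values; it is not the Thesis evaluation script.

import numpy as np

def pearson_r(simulated, observed):
    # Pearson correlation: covariance normalized by the product of standard deviations
    sim = np.asarray(simulated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    sim_anom = sim - sim.mean()
    obs_anom = obs - obs.mean()
    return (sim_anom * obs_anom).sum() / np.sqrt(
        (sim_anom ** 2).sum() * (obs_anom ** 2).sum()
    )

# Hypothetical hourly streamflow values for illustration only
observed = [12.0, 15.5, 30.2, 22.1, 18.4]
simulated = [11.5, 16.0, 27.8, 23.0, 17.9]
print(round(pearson_r(simulated, observed), 3))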