This dissertation centers on uncertainty-aware reinforcement learning (RL), which seeks to improve the efficiency of RL by incorporating uncertainty. RL is a central mathematical framework in artificial intelligence (AI) for building autonomous agents that learn optimal behaviors through interaction with their environments. However, RL is often criticized for being sample inefficient and computationally demanding. To address these challenges, this dissertation has two primary goals: to provide a theoretical understanding of uncertainty-aware RL, and to develop practical algorithms that use uncertainty to make RL more efficient.
Our first objective is to develop a sample-efficient RL approach for Markov decision processes (MDPs) with large state and action spaces. We present an uncertainty-aware RL algorithm that incorporates function approximation, and we prove that it achieves near-minimax-optimal statistical complexity for learning the optimal policy. Our second objective addresses two specific settings: batch learning and rare policy switching. For both settings, we propose uncertainty-aware RL algorithms with limited adaptivity. These algorithms significantly reduce the number of policy switches relative to previous baseline algorithms while maintaining a similar level of statistical complexity. Finally, we turn to uncertainty estimation for neural network-based estimation models. We introduce a gradient-based method for computing these uncertainties that is computationally efficient and yields valid and reliable uncertainty estimates.
The methods and techniques presented in this dissertation advance our understanding of the fundamental limits of RL. These findings pave the way for further work on the design of decision-making algorithms.