Criticality-Based Advice in Reinforcement Learning

License: Creative Commons Attribution 4.0 (CC BY 4.0)
Abstract

One of the ways to make reinforcement learning (RL) more efficient is by utilizing human advice. Because human advice is expensive, the central question in advice-based reinforcement learning is how to decide in which states the agent should ask for advice. To approach this challenge, various advice strategies have been proposed. Although all of these strategies distribute advice more efficiently than naive strategies (such as choosing random states), they rely solely on the agent's internal representation of the task (the action-value function, the policy, etc.) and are therefore rather inefficient when this representation is not accurate, in particular in the early stages of the learning process. To address this weakness, we propose an approach to advice-based RL in which the human's role is not limited to giving advice in chosen states, but also includes hinting a priori (before the learning procedure) which sub-domains of the state space require more advice. Specifically, we suggest different ways to improve any given advice strategy by utilizing the concept of critical states: states in which it is very important to choose the correct action. Finally, we present experiments in two environments that validate the efficiency of our approach.
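To make the idea concrete, the sketch below shows one plausible way a human-supplied, a-priori criticality hint could modulate an existing advice strategy. This is an illustration under stated assumptions, not the paper's actual method: the functions `base_ask_score`, `critical_region`, and `should_ask_advice`, the threshold, and the boost factor are all hypothetical, and the paper may define criticality and combine it with the base strategy differently.

```python
import numpy as np

def base_ask_score(q_values: np.ndarray) -> float:
    """Stand-in for any existing advice strategy that scores how
    uncertain the agent is in a state. Here: a small gap between
    the top two Q-value estimates means high uncertainty."""
    top_two = np.sort(q_values)[-2:]
    return 1.0 / (1.0 + (top_two[1] - top_two[0]))

def critical_region(state: np.ndarray) -> bool:
    """Human-supplied, a-priori hint marking a critical sub-domain
    of the state space. Hypothetical example: states near the
    origin are flagged as critical before learning starts."""
    return bool(np.linalg.norm(state) < 1.0)

def should_ask_advice(state: np.ndarray,
                      q_values: np.ndarray,
                      budget_left: int,
                      threshold: float = 0.6,
                      critical_boost: float = 2.0) -> bool:
    """Combine the agent's internal strategy with the human hint:
    the ask score is boosted inside critical sub-domains, so a
    limited advice budget concentrates where wrong actions are
    most costly."""
    score = base_ask_score(q_values)
    if critical_region(state):
        score *= critical_boost
    return budget_left > 0 and score >= threshold

# Example: near-tied Q-values inside the critical region -> ask.
state = np.array([0.2, -0.3])
q = np.array([0.9, 1.0, 0.95])
print(should_ask_advice(state, q, budget_left=5))  # True
```

The multiplicative boost is just one way to inject the hint; it leaves the base strategy untouched, which matches the abstract's claim that the approach can improve any given advice strategy rather than replace it.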
