eScholarship
Open Access Publications from the University of California

Value-of-Information based Arbitration between Model-based and Model-free Control

Creative Commons 'BY' version 4.0 license
Abstract

There have been numerous attempts to explain general learning behaviours using model-based and model-free methods. While model-based control is flexible yet computationally expensive in planning, model-free control is quick but inflexible. Multiple arbitration schemes have been suggested to achieve the data efficiency and computational efficiency of model-based and model-free control schemes, respectively. In this context, we propose a quantitative 'value-of-information' based arbitration between the two controllers in order to establish a general computational framework for skill learning. The interacting model-based and model-free reinforcement learning processes are arbitrated using an uncertainty-based value-of-information estimation. We further show that our algorithm performs better than Q-learning as well as Q-learning with experience replay.
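
For orientation, the two baselines named in the abstract can be sketched in a few lines. This is a minimal illustration of standard tabular Q-learning and uniform experience replay, not the authors' arbitration algorithm, which the abstract does not specify. The function names, the toy transition, and the values of `alpha` and `gamma` are all assumptions made for the example.

```python
import random
from collections import defaultdict


def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Bootstrapped target: reward plus discounted best next-state value.
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


def replay(Q, buffer, actions, n_samples, rng, alpha=0.1, gamma=0.9):
    """Re-apply the same update to transitions sampled uniformly from a buffer."""
    for s, a, r, s_next in rng.choices(buffer, k=n_samples):
        q_update(Q, s, a, r, s_next, actions, alpha, gamma)


# Hypothetical toy data: a single stored transition (state, action, reward, next state).
Q = defaultdict(float)
actions = [0, 1]
buffer = [("s0", 1, 1.0, "s1")]

q_update(Q, *buffer[0], actions)                                 # one online update
replay(Q, buffer, actions, n_samples=3, rng=random.Random(0))    # three replayed updates
```

Replay reuses stored transitions to squeeze more learning out of each real interaction, which is the data-efficiency axis on which the abstract compares the proposed method against these baselines.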
