Value-of-Information based Arbitration between Model-based and Model-free Control
Abstract
There have been numerous attempts to explain general learning behaviours using model-based and model-free methods. Model-based control is flexible but computationally expensive in planning, whereas model-free control is quick but inflexible. Multiple arbitration schemes have been suggested to combine the data efficiency of model-based control with the computational efficiency of model-free control. In this context, we propose a quantitative 'value-of-information' based arbitration between the two controllers in order to establish a general computational framework for skill learning. The interacting model-based and model-free reinforcement learning processes are arbitrated using an uncertainty-based value-of-information estimation. We further show that our algorithm performs better than Q-learning as well as Q-learning with experience replay.