eScholarship
Open Access Publications from the University of California

Priority-Adjusted Replay for Successor Representations

Abstract

Intelligent agents are capable of transfer and generalization. This flexibility in adapting to new tasks and environments often relies on representation learning and replay. Among the algorithms that implement these mechanisms, successor representation learning and memory replay offer biologically plausible solutions. However, replay prioritization algorithms remain largely limited to value prediction errors. Here we propose PARSR, Priority-Adjusted Replay for Successor Representations, to address this limitation. By decoupling the learning of environment dynamics from the learning of rewards, PARSR can use prediction errors from either representation learning or value learning to prioritize memory replay. We compare PARSR to SR-Dyna, Dyna-Q, and a number of state-of-the-art algorithms from cognitive neuroscience that use replay and successor representations. We find that PARSR reproduces human behavior in a number of revaluation tasks while also improving on the performance of SR-Dyna, its closest counterpart.
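
The abstract does not give update rules, so the following is only a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes a tabular successor matrix `M` learned by temporal-difference updates, a separate reward vector `w`, and a replay queue ordered by whichever prediction error is selected. All function names, the learning rates, the discount factor, and the threshold are hypothetical choices made for illustration.

```python
import heapq
import numpy as np

def sr_td_update(M, s, s_next, alpha=0.1, gamma=0.95):
    """One tabular TD update of row s of the successor matrix M.

    Returns the summed absolute SR prediction error (assumed priority signal).
    """
    onehot = np.zeros(M.shape[0])
    onehot[s] = 1.0
    delta = onehot + gamma * M[s_next] - M[s]
    M[s] += alpha * delta
    return float(np.abs(delta).sum())

def reward_update(w, s_next, r, alpha=0.1):
    """Delta-rule update of the reward estimate for the state just entered."""
    delta = r - w[s_next]
    w[s_next] += alpha * delta
    return abs(float(delta))

def replay(M, w, queue, n_steps=10, use_sr_error=True, threshold=1e-3):
    """Replay stored transitions in order of priority.

    queue holds (-priority, (s, s_next, r)) tuples, so heapq pops the
    largest priority first. A transition is re-queued while its chosen
    error (SR error or reward error) stays above the threshold.
    """
    for _ in range(n_steps):
        if not queue:
            break
        _, (s, s_next, r) = heapq.heappop(queue)
        sr_err = sr_td_update(M, s, s_next)
        rw_err = reward_update(w, s_next, r)
        priority = sr_err if use_sr_error else rw_err
        if priority > threshold:
            heapq.heappush(queue, (-priority, (s, s_next, r)))

def values(M, w):
    """State values under the SR factorization: V = M w."""
    return M @ w
```

Because dynamics (`M`) and rewards (`w`) are stored separately in this sketch, a change in reward only alters `w`, and values computed as `M @ w` reflect it without relearning `M`. This is the property that revaluation tasks are designed to probe.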
