eScholarship
Open Access Publications from the University of California

Dynamic Reinforcement Driven Error Propagation Networks with Application to Game Playing

Abstract

This paper discusses the problem of reinforcement-driven learning of a response to a time-varying sequence. The problem has three parts: adapting internal parameters to model complex mappings; representing time-varying input in the architecture; and assigning credit when there are unknown delays between the input, output, and reinforcement signals. The method developed in this paper is based on a connectionist network trained using the error propagation algorithm with internal feedback. The network is viewed both as a context-dependent predictor of the reinforcement signal and as a means of temporal credit assignment. Several architectures for these networks are discussed, and an application to the game of noughts and crosses gives insight into the implementation problems.
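The core idea of the abstract, a network with internal feedback trained by error propagation to predict a delayed reinforcement signal, can be sketched in a few lines. The sketch below is not the paper's architecture: the dimensions, learning rate, and toy episodes are illustrative assumptions, and for simplicity the gradient is truncated to one step (the feedback connection is treated as a fixed input) rather than propagated through time. Each board position in an episode is trained toward the terminal reinforcement, which is the temporal-credit-assignment role the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 9 board cells for noughts and crosses, a small hidden layer.
N_IN, N_HID = 9, 12

# Elman-style recurrence: the previous hidden state feeds back as extra input.
W_in = rng.normal(0.0, 0.1, (N_HID, N_IN + N_HID))
b_h = np.zeros(N_HID)
w_out = rng.normal(0.0, 0.1, N_HID)
b_out = 0.0


def forward(x, h):
    """One time step: board vector x plus fed-back hidden state h -> (r_hat, new h)."""
    z = np.concatenate([x, h])
    h_new = np.tanh(W_in @ z + b_h)
    r_hat = w_out @ h_new + b_out  # predicted reinforcement for this context
    return r_hat, h_new


def train_episode(boards, reward, lr=0.1):
    """Credit-assignment sketch: train every position in the episode toward the
    terminal reinforcement, using one-step gradients (feedback path detached)."""
    global W_in, b_h, w_out, b_out
    h = np.zeros(N_HID)
    for x in boards:
        z = np.concatenate([x, h])
        h_new = np.tanh(W_in @ z + b_h)
        r_hat = w_out @ h_new + b_out
        err = r_hat - reward
        # Gradients of 0.5 * err**2 w.r.t. the parameters.
        g_h = err * w_out * (1.0 - h_new**2)
        w_out -= lr * err * h_new
        b_out -= lr * err
        W_in -= lr * np.outer(g_h, z)
        b_h -= lr * g_h
        h = h_new


def predict(boards):
    """Predicted reinforcement after seeing a whole move sequence."""
    h = np.zeros(N_HID)
    r_hat = 0.0
    for x in boards:
        r_hat, h = forward(x, h)
    return r_hat


# Toy data: two fixed three-move "games", one rewarded +1, one -1.
win = [np.eye(9)[i] for i in (0, 4, 8)]   # marks along the main diagonal
lose = [np.eye(9)[i] for i in (1, 3, 5)]
for _ in range(2000):
    train_episode(win, +1.0)
    train_episode(lose, -1.0)
```

After training, `predict(win)` approaches +1 and `predict(lose)` approaches -1, i.e. the network has learned to act as a context-dependent predictor of the reinforcement signal for these two sequences.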
