
Forced Simple Recurrent Neural Networks and Grammatical Inference

Abstract

A simple recurrent neural network (SRN), introduced by Elman [1990], can be trained to infer a regular grammar from positive examples of symbol sequences generated by the grammar. The network is trained, through the back-propagation of error, to predict the next symbol in each sequence as the symbols are presented successively as inputs to the network. The modes of prediction failure of the SRN architecture are investigated. The SRN's internal encoding of the context (the previous symbols of the sequence) is found to be insufficiently developed when a particular aspect of context is not required for the immediate prediction at some point in the input sequence, but is required later. It is shown that this mode of failure can be avoided by using the auto-associative recurrent network (AARN). The AARN architecture contains additional output units, which are trained to reproduce the current input and the current context. The effect of the size of the training set for grammatical inference is also considered. The SRN has been shown to be effective when trained on an infinite (very large) set of positive examples [Servan-Schreiber et al., 1991]. When a finite (small) set of positive training data is used, the SRN architecture demonstrates a lack of generalization capability. This problem is solved through a new training algorithm that uses both positive and negative examples of the sequences. Simulation results show that when there is a restriction on the number of nodes in the hidden layers, the AARN succeeds in the cases where the SRN fails.
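The sketch below illustrates the architectural difference the abstract describes: an Elman-style SRN that predicts only the next symbol, versus an AARN-style variant with extra output units trained to reproduce the current input and the current context. It is a minimal illustration only; the layer sizes, loss weighting, and use of the previous hidden state as the "context" target are assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn


class SRN(nn.Module):
    """Elman-style simple recurrent network: predict the next symbol from input + context."""

    def __init__(self, n_symbols, n_hidden):
        super().__init__()
        self.rnn = nn.RNN(n_symbols, n_hidden, nonlinearity="tanh", batch_first=True)
        self.predict = nn.Linear(n_hidden, n_symbols)  # next-symbol prediction head

    def forward(self, x, h0=None):
        h, _ = self.rnn(x, h0)                         # h: (batch, time, n_hidden)
        return self.predict(h), h


class AARN(SRN):
    """AARN-style variant: additional outputs reproduce the current input and context."""

    def __init__(self, n_symbols, n_hidden):
        super().__init__(n_symbols, n_hidden)
        self.reconstruct_input = nn.Linear(n_hidden, n_symbols)   # auto-associate the input
        self.reconstruct_context = nn.Linear(n_hidden, n_hidden)  # auto-associate the context

    def forward(self, x, h0=None):
        logits, h = super().forward(x, h0)
        return logits, self.reconstruct_input(h), self.reconstruct_context(h), h


if __name__ == "__main__":
    n_symbols, n_hidden, seq_len = 7, 10, 5           # illustrative sizes (assumed)
    x = torch.zeros(1, seq_len, n_symbols)
    x[0, torch.arange(seq_len), torch.randint(n_symbols, (seq_len,))] = 1.0  # one-hot symbols

    model = AARN(n_symbols, n_hidden)
    logits, x_hat, ctx_hat, h = model(x)

    # Combined loss: next-symbol prediction plus the two auto-associative terms.
    targets = x[:, 1:].argmax(-1)                     # next symbol at each step
    prev_h = torch.cat([torch.zeros(1, 1, n_hidden), h[:, :-1]], dim=1)  # previous hidden state as "context"
    loss = (nn.functional.cross_entropy(logits[:, :-1].reshape(-1, n_symbols), targets.reshape(-1))
            + nn.functional.mse_loss(x_hat, x)
            + nn.functional.mse_loss(ctx_hat, prev_h.detach()))
    loss.backward()
    print(f"total loss: {loss.item():.3f}")
```

The intuition behind the extra terms is that forcing the hidden layer to support reconstruction of the input and context keeps aspects of context encoded even when they are not needed for the immediate prediction, which is exactly the failure mode the abstract attributes to the plain SRN.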
