Skip to main content
Open Access Publications from the University of California

Learning to count: a neural network model of the successor function


What does it mean for a neural network to become a “cardinal principal knower”? We trained a multilayer perceptron to compute the successor of the numbers 0-99. N and N+1 were one-hot encoded on the input and output layers, respectively; the hidden layer had 8 units. 80% of the (N, N+1) pairs constituted the training data, the remaining 20% the test data. The mean cosine similarities of the hidden layer representations of the (N, N+1) pairs was 0.77 (0.79) when N was in the training (test) set. The network learned a continuous notion of number: the hidden-layer representations of N and N+1 were comparable whether they did (0.74) or did not (0.78) cross a tens boundary. Thus, the network did not “discover” place-value. Ongoing research is exploring place-value encoding of inputs and outputs, and also structuring of the training data to better reflect the numerical environment of the child.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View