eScholarship
Open Access Publications from the University of California

Efficiency of Learning in Experience-Limited Domains: Generalization Beyond the WUG Test

Abstract

Learning to read English requires learning the complex statistical dependencies between orthography and phonology. Previous research has focused on how these statistics are learned in neural network models provided with as much training as needed. Children, however, are expected to acquire this knowledge in a few years of school with only limited instruction. We examined how these mappings can be learned efficiently, defined by tradeoffs between the number of words that are explicitly trained and the number that are correct by generalization. A million models were trained, varying the sizes of randomly-selected training sets. For a target corpus of about 3000 words, training sets of 200–300 words were most efficient, producing generalization to as many as 1800 untrained words. Composition of the 300 word training sets also greatly affected generalization. The results suggest directions for designing curricula that promote efficient learning of complex material.
