PARSNIP: A Connectionist Network that Learns Natural Language Grammar from Exposure to Natural Language Sentences

Abstract

Linguists have pointed out that exposure to language is probably not sufficient for a general, domain-independent learning mechanism to acquire natural language grammar. This "poverty of the stimulus" argument has prompted linguists to invoke a large innate component in language acquisition as well as to discourage views of a general learning device (GLD) for language acquisition. We describe a connectionist non-supervised learning model (PARSNIP) that "learns" on the basis of exposure to natural language sentences from a million-word machine-readable text corpus (the Brown corpus). PARSNIP, an auto-associator, was shown three separate samples consisting of 10, 100, or 1000 syntactically tagged sentences, each 15 words or less. The network learned to produce correct syntactic category labels corresponding to each position of the sentence originally presented to it, and it was able to generalize to another 1000 sentences which were distinct from all three training samples. PARSNIP does sentence completion on sentence fragments, prefers syntactically correct sentences, and also recognizes novel sentence patterns absent from the presented corpus. One interesting parallel between PARSNIP and human language users is the fact that PARSNIP correctly reproduces test sentences reflecting one-level-deep center-embedded patterns which it has never seen before while failing to reproduce multiply center-embedded patterns.
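The abstract does not give PARSNIP's exact architecture or training details, so the following is only a minimal sketch of the auto-associator idea it describes: a network trained to reproduce its own input, here a sentence encoded as one one-hot syntactic tag per position. The tag inventory, hidden-layer size, learning rate, and training corpus below are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal auto-associator sketch over syntactic tag sequences.
# Hypothetical details: tag set, HIDDEN size, and training regime are
# illustrative assumptions; the paper's actual setup may differ.
import numpy as np

rng = np.random.default_rng(0)

TAGS = ["DET", "NOUN", "VERB", "ADJ", "PREP", "PAD"]  # toy tag inventory
MAX_LEN = 15          # sentences of 15 words or less, per the abstract
K = len(TAGS)
INPUT = MAX_LEN * K   # one-hot tag per sentence position, concatenated
HIDDEN = 40           # assumed bottleneck size

def encode(tag_seq):
    """One-hot encode a tag sequence, padding to MAX_LEN positions."""
    x = np.zeros((MAX_LEN, K))
    padded = tag_seq + ["PAD"] * (MAX_LEN - len(tag_seq))
    for i, t in enumerate(padded):
        x[i, TAGS.index(t)] = 1.0
    return x.reshape(-1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Weights for the input->hidden and hidden->output layers.
W1 = rng.normal(0, 0.1, (HIDDEN, INPUT))
W2 = rng.normal(0, 0.1, (INPUT, HIDDEN))

def train(patterns, epochs=2000, lr=0.5):
    """Plain backpropagation on the reconstruction (auto-association) error."""
    global W1, W2
    for _ in range(epochs):
        for x in patterns:
            h = sigmoid(W1 @ x)          # hidden activations
            y = sigmoid(W2 @ h)          # reconstructed input
            err = x - y                  # target IS the input itself
            # Gradients of squared error through the sigmoids.
            d_out = err * y * (1 - y)
            d_hid = (W2.T @ d_out) * h * (1 - h)
            W2 += lr * np.outer(d_out, h)
            W1 += lr * np.outer(d_hid, x)

corpus = [
    ["DET", "NOUN", "VERB"],
    ["DET", "ADJ", "NOUN", "VERB", "PREP", "DET", "NOUN"],
]
train([encode(s) for s in corpus])

# Reconstruction: read off the most active tag at each position.
out = sigmoid(W2 @ sigmoid(W1 @ encode(corpus[0]))).reshape(MAX_LEN, K)
print([TAGS[i] for i in out.argmax(axis=1)][:3])  # expect DET NOUN VERB
```

Because the target pattern is the input itself, no labeled supervision beyond the tagged sentences is needed; reconstruction error alone drives learning, which is what makes the auto-associative setup a "non-supervised" model in the abstract's sense. Sentence completion then amounts to presenting a fragment and reading the network's reconstruction at the missing positions.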
