eScholarship
Open Access Publications from the University of California

Distributional Bootstrapping: From Word Class to Proto-Sentence

Abstract

There have been various suggestions about how children might acquire a proto-classification of elements of natural language, such as is conjectured to be necessary to allow the child to "bootstrap" language acquisition (Maratsos 1979; Pinker 1984). One, proposed by Kiss (1972) and Maratsos (1979), but criticised by Pinker (1984), is that children look for distributional correlations between simple linguistic phenomena in the language they hear in order to derive more sophisticated abstract linguistic classifications. Finch & Chater (1992) showed that a relatively complete syntactic classification of the lexicon could be found for common words in natural language using distributional bootstrapping. This paper reviews some of the arguments Pinker raises against distributional methods, and then describes a system which overcomes his objections, where sequences of words are classified into phrasal classes by a linguistically naive statistical analysis of distributional regularities from a large, noisy, untagged corpus. For many classes, such as sentence and verb phrase, the accuracy of the classification (i.e., the proportion of putative sentences which can in fact be linguistically interpreted as sentences) is in the region of 90%, thus enabling the child to break the "bootstrapping problem".
