eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Acquiring latent linguistic structure using computational models

Abstract

Language contains a great deal of latent structure, which shapes the produced linguistic forms but does not directly appear in them. Identifying this latent structure is both a core goal of linguistic theories and the task confronting a child acquiring language. This dissertation investigates the acquisition of latent linguistic structure using computational models over a range of linguistic phenomena. These models share two central features. First, they rely as much as possible on observed language data to determine the latent structure. This supports emergentist accounts of acquisition, where a learner relies primarily on cognitively-general learning methods to extract the structure from the data, as opposed to innatist accounts, which rely primarily on substantial innate foreknowledge of the linguistic structure. Second, the models combine general Bayesian analysis principles with appropriate representations of the linguistic structure for each problem to maximize the information they obtain from the language data. Four models are proposed in this dissertation. Two examine the source of phonological constraints in Optimality Theory and argue that these constraints can be acquired from language data with little to no innate phonological structure, contrary to the standard, innatist position. The third model addresses early word segmentation from speech, and shows that a Bayesian model for incorporating multiple cues outperforms a single-cue model on segmentation, as well as providing a possible explanation for human segmentation behavior. The final model moves into applications, and shows that accurately representing the bursty behavior of the language data improves the fit of a topic model.
