eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

Self-directed learning in language development: Interactions of linguistic complexity, learner attention, and language socialization

No data is associated with this publication.
Abstract

Children are famously scrappy learners: curious, active, and resourceful. And yet when we consider their development of language, a complex social system that children are highly motivated to master, we tend to study them as passive recipients of adult guidance. This overlooks language development as a fruitful domain in which to study children’s self-directed learning, as well as insights that recent active learning frameworks could bring to language development. In this dissertation, I discuss language development as a coordinated process between communicative adults and increasingly active learners. In particular, I see children’s learning from speech not directed to them, but rather overheard, as a uniquely ecologically valid test case of their self-directed learning capabilities. A combination of experimental and computational studies from this perspective speaks to an apparent paradox in the language development literature: while studies testing correlations between sources of language input in toddlers’ home environments and later vocabulary growth have been taken to indicate that overheard speech is ineffective for word-learning, numerous experimental studies show in-lab learning from simplified indirect speech during the same period. The idea, borrowed from experiments with infants, that children may disattend to stimuli that are too complex for their current level of competence may help explain these conflicting results. That is, young rational learners may initially learn little from overhearing because the speech that surrounds them is too complex to maintain their attention, especially when compared to the speech that they receive in interactions with adults.

A first study compares multiple empirically motivated metrics of speech complexity in large-scale longitudinal corpora of child-directed speech and in overheard speech simulated via corpora of adult-adult conversations. We find that words in simulated overheard speech are likely to be less concrete, more unpredictable, later-acquired, and lower-frequency than words in speech to children. This is likely to be true through at least the first four years of life, spanning the period when measurements of overheard speech quantity in children’s environments have repeatedly been found to be unrelated to children’s early vocabularies.

Across three studies in the second chapter, we test children’s ability to learn from dense, naturalistic overheard speech in a context designed to place significant demands on their self-directed learning abilities, including their spontaneous recognition of an “information gap” and independent information-gathering. In contrast to previous laboratory experiments, but consistent with many day-to-day overhearing opportunities, the speech we used included multiple pieces of novel linguistic information, embedded in diverse sentence structures, and delivered in the register and rate typical of adult-adult conversations. While all children in our sample were able to learn a set of 5–6 novel facts, only older preschoolers (mean age = 5.1 years) demonstrated robust learning of novel words through overhearing. Analyses of children’s play and gaze behavior during the overhearing episode suggest that older children’s success is due at least in part to their enhanced ability to coordinate attention between the referential context and the nearby speech.

In the third chapter, we develop a novel method to test the classic idea that children learn best from information that is of an appropriate level of complexity for them, and in particular the role that children themselves might play in actively selecting and attending to potential sources of information. By measuring children’s attention to a story narrated at distinct levels of verbal complexity (operationalized in terms of words’ estimated age of acquisition), we find evidence that children attend more to speech that is more appropriate for their level of competence. Furthermore, while previous research has assumed that children’s attention and learning are meaningfully related, our method provides direct evidence, as children’s self-directed attention to the story predicted their comprehension of and ability to learn new words from it.

Inspired by qualitative studies typically limited to child-directed speech, in the fourth chapter, we develop a coding scheme that enables us to characterize the full range of potential sources of language accessible to a given child, in terms of their relative utility for word-learning. In applying this scheme to longitudinal video data from the home of a single English-learning child, we find that features that contribute to the referential transparency and salience of an utterance (in and out of the laboratory) are not exclusive to child-directed speech, but rather occur with some lower frequency in overheard speech as well. In light of this, our analyses suggest a functional role for caregivers’ exaggerated prosody in distinguishing speech intended for the child, and as a self-reinforcing cue to language where the child’s attention is likely to be rewarded. Through this fine-grained coding of individual utterances in context, our results thus uncover dynamics in how adults and children co-structure the early language environment, and in how the landscape itself shifts with the child’s maturation, which are otherwise hidden from more quantitative approaches.

Ongoing work extends the ideas in the dissertation to new contexts and populations, beginning by employing the same scheme to describe crosslinguistic learning environments, facilitating contact with more humanistic fields like anthropology. The fifth chapter adapts existing measures of implicit word knowledge to test culturally specific language knowledge in Tseltal Maya infants, who primarily overhear. While the preceding chapters challenge our assumptions about how language is typically learned, this work aims to expand our (testable) notions of what counts as legitimate language knowledge. The experimental studies in this dissertation share a focus on using naturalistic speech and ecologically valid contexts, and together point to the role of domain-general processes like attention, information processing, and learner adaptation in the course of language development.

Main Content

This item is under embargo until October 30, 2025.