eScholarship
Open Access Publications from the University of California

Understanding Children’s Speech Productions: Man Versus Machine

Abstract

Young children’s speech pronunciations deviate systematically from adult forms. For example, onsets are often simplified (e.g., stop becomes top), unstressed syllables are frequently deleted (e.g., spaghetti becomes getti), and certain segments are commonly replaced with other ones (e.g., rice becomes wice). The current study examined how well adults and a popular automatic speech recognition system (i.e., Siri) deal with these deviations. The same 12 children were recorded producing 32 words in isolation at three ages: 2.5, 3.5, and 5.5 years. Twelve adults were also recorded. These recordings were presented to 48 young adults, 7 mothers, and Siri for transcription. All listeners performed worst with 2.5-year-old productions, and humans outperformed Siri with all ages (p < 0.001). Mothers demonstrated the highest accuracy with 2.5-year-old productions (86%). Additionally, Siri made distinctive transcription errors with children’s speech. These errors may reflect the system’s lack of training with young children’s voices.
