Cortical representation of vocal pitch during speech perception in human superior temporal gyrus
- Author(s): Tang, Claire
- Advisor(s): Chang, Edward; et al.
Pitch plays a crucial role in all spoken languages. In tone languages, pitch is used to distinguish between different words, such that the same syllable can have multiple lexical meanings depending on its pitch contour. In all other spoken languages, pitch conveys linguistic meaning at the sentence level through speech intonation. For example, in English, raising the pitch at the end of an utterance can change a statement into a question. Despite the importance of pitch for spoken language, we have only a limited understanding of how the human brain processes speech to represent linguistically relevant pitch. One difficulty in encoding linguistic meaning in pitch is that vocal pitch range varies widely across speakers. Languages therefore cannot use absolute pitch values to convey meaning, since a pitch value that is high in absolute terms may lie outside the range of a low-pitched speaker, and vice versa. Instead, linguistic meaning must be conveyed through a speaker-normalized representation of pitch. This dissertation seeks to understand how the human auditory cortex represents pitch information during speech perception. Using electrocorticography to record neural activity directly from the cortical surface of participants as they listened to both natural speech and controlled speech stimuli, I discovered populations of neurons in the human superior temporal gyrus whose activity patterns differentiate lexical tones and intonation contours. These neural populations are separate from those that encode the phonetic features that make up different consonants and vowels, and from those that encode information about speaker identity. Furthermore, I show that the activity of these tone and intonation neural populations can be explained by the encoding of speaker-normalized relative pitch and pitch change.
These results suggest that the human auditory cortex extracts vocal pitch from speech and abstracts away from absolute pitch values, encoding linguistically relevant, speaker-normalized pitch information at the level of non-primary auditory cortex.
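
As a rough illustration of what a speaker-normalized pitch representation can look like, the sketch below z-scores log-transformed pitch values within each speaker. This is an assumed, commonly used normalization and not necessarily the one used in the dissertation; the function name `speaker_normalized_pitch` and the pitch values are hypothetical.

```python
import numpy as np


def speaker_normalized_pitch(f0_hz, speaker_ids):
    """Z-score log pitch within each speaker (an assumed normalization)."""
    f0_hz = np.asarray(f0_hz, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    log_f0 = np.log(f0_hz)
    relative = np.empty_like(log_f0)
    for speaker in np.unique(speaker_ids):
        mask = speaker_ids == speaker
        # Remove the speaker's own mean and spread, so that only the
        # position of each sample within that speaker's range remains.
        relative[mask] = (log_f0[mask] - log_f0[mask].mean()) / log_f0[mask].std()
    return relative


# Two hypothetical speakers producing the same rising contour one octave apart.
low_voice = np.array([100.0, 110.0, 125.0, 150.0])  # Hz, made-up values
high_voice = 2.0 * low_voice
f0 = np.concatenate([low_voice, high_voice])
speakers = np.array(["low"] * 4 + ["high"] * 4)

relative = speaker_normalized_pitch(f0, speakers)
# Pitch change expressed as the sample-to-sample difference in relative pitch.
change_low = np.diff(relative[:4])
```

In this toy example the two contours differ by an octave in absolute pitch but map onto the same relative-pitch values, which is the property a speaker-normalized code needs in order to carry the same linguistic meaning across voices.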