eScholarship
Open Access Publications from the University of California

UCLA

UCLA Electronic Theses and Dissertations

The analysis and applications of Subglottal Resonances in height estimation and speaker identification and normalization

Abstract

The subglottal acoustic system refers to the acoustic system below the glottis, which consists of the trachea, bronchi and lungs. Compared to the supraglottal system, the configuration of the subglottal system is relatively fixed and more speaker dependent. Previous research has shown that the natural frequencies of the subglottal system, which are referred to as subglottal resonances (SGRs), form the boundaries of vowel classes for several languages. Results in previous studies also indicate that SGRs correlate well with the standing height of adult speakers. Motivated by these properties, SGRs have been used in different applications, including adults’ height estimation and speaker normalization for automatic speech recognition (ASR). In this thesis, our knowledge of SGRs is leveraged to extend their utility to more languages and applications, including: (1) finding the relationship between SGRs and vowel class, speaker height and F0 variation for Mandarin speakers; (2) finding the relationship between SGRs, height estimation and speaker normalization for children’s speech; (3) investigating SGRs for speaker identification (SID) under noisy conditions.

The results indicate that, as for English speakers, SGRs divide the vowel space for Mandarin speakers, and that there exist strong inverse relationships between SGRs and speaker height, and between SGRs and trunk length. Moreover, although Mandarin is a tonal language in which F0 varies over time within a vowel, there is no statistically significant variation of SGRs in Mandarin speech. For the study of children’s speech, an age-dependent SGR estimation algorithm is designed. The experiments show that the algorithm is effective for children’s height estimation and speaker normalization. For SID, SGRs are used as noise-robust features to provide complementary information to state-of-the-art noise-robust features, such as power normalized cepstral coefficients. A two-stage framework is developed, and the results show that SGRs provide significant performance improvements.
