Discriminative Acoustic Features for Deployable Speech Recognition
- Faria, Arlo
- Advisor(s): Morgan, Nelson
Abstract
This work explores discriminative acoustic features: audio signal representations produced by multilayer perceptrons, such as the Tandem and bottleneck features used in state-of-the-art automatic speech recognition systems. Experimental results highlight the factors that influence performance in terms of accuracy and speed, and novel approaches are introduced to improve both. The overall emphasis is on discovering techniques that are suitable for practical deployment and that remain effective beyond a traditional research setting. Applications include real-time, low-latency audio stream processing on mobile devices, as well as systems that are built with low-quality training data and must cope with the diverse complications of "real-world" use cases.