Speech Analysis Methodologies towards Unobtrusive Mental Health Monitoring
The human voice encodes a wealth of information about emotion, mood, and mental state. The pervasive availability of speech collection devices (e.g., mobile phones), together with the low computational cost of speech analysis, suggests that non-invasive, relatively reliable, and inexpensive platforms are now available for mass, long-term deployment of a mental health monitor. In this thesis, I describe my investigation of speech analysis methods for measuring a variety of mental states, including affect and states induced by psychological stress and sleep deprivation.
This work makes several contributions and brings together techniques from multiple areas, including speech processing, psychology, human-computer interaction, and mobile computing systems. First, I revisited emotion recognition methods by building an affective model on a naturalistic emotional speech dataset, which consists of a realistic set of emotion labels suited to real-world applications. Second, leveraging speech production theory, I verified that the glottal vibratory cycles, the source of speech production, are physically affected by psychological states such as mental stress. Finally, I built the AMMON (Affective and Mental health MONitor) library, a low-footprint C library designed for widely available phones, as an enabler of applications for richer, more appropriate, and more satisfying human-computer interaction and healthcare technologies.