Cortical encoding and decoding models of speech production
- Author(s): Chartier, Josh
- Advisor(s): Chang, Edward F
- et al.
To speak is to dynamically orchestrate the movements of the articulators (jaw, tongue, lips, and larynx), which in turn generate speech sounds. It is an amazing mental and motor feat that is controlled by the brain and is fundamental for communication. Technology that could translate brain signals into speech would be transformative for people who are unable to communicate as a result of neurological impairments. This work first investigates how articulator movements that underlie natural speech production are represented in the brain. Building upon this, this work also presents a neural decoder that can synthesize audible speech from brain signals. Data to support these results were from direct cortical recordings of the human sensorimotor cortex while participants spoke natural sentences. Neural activity at individual electrodes encoded a diversity of articulatory kinematic trajectories (AKTs), each revealing coordinated articulator movements towards specific vocal tract shapes. The neural decoder was designed to leverage the kinematic trajectories encoded in the sensorimotor cortex which enhanced performance even with limited data. In closed vocabulary tests, listeners could readily identify and transcribe speech synthesized from cortical activity. These findings advance the clinical viability of using speech neuroprosthetic technology to restore spoken communication.