eScholarship
Open Access Publications from the University of California

UCSF

UC San Francisco Electronic Theses and Dissertations

AI-driven brain-computer interfaces for speech

Abstract

Speech is a fundamental human behavior that enables the fast, effortless expression of emotions, desires, and needs. Devastating conditions like paralysis and brain-stem stroke can rob individuals of the ability to speak, even though they retain intact cognitive abilities. Brain-Computer Interface (BCI) technology offers hope for such individuals by recording their intact neural signals with an implanted device, then deciphering what the person is trying to say using machine learning and artificial intelligence. Prior to beginning this thesis, many questions remained in the development of speech BCIs. Could speech be decoded from the brain of a person who had been paralyzed and unable to speak for many years? What algorithms could do this? What recording technologies could be used? Would the brain activity look similar to that of healthy speakers, or would it have changed? Could we enable someone to speak quickly and naturalistically with these devices?

With these questions in mind, we launched the BCI Restoration of Arm and Voice (BRAVO) clinical trial. This trial explores the use of electrocorticography (ECoG), a high-resolution neural recording modality, to read out and decode neural activity during intended speech. This thesis presents results from work I have done as part of this clinical trial, and demonstrates successful speech decoding with two incredible participants who have each been unable to speak for over 15 years.

Chapter 1 introduces a proof-of-concept speech BCI in someone who cannot speak, showcasing real-time decoding of 50 words and restoring communication at 15 words per minute. Chapter 2 expands the scope of speech BCIs, enabling speech-based spelling with NATO codewords, achieving 29 characters per minute and allowing communication with a practically unlimited vocabulary. We also show that silent-speech attempts can be decoded, and that low-frequency neural activity (between 0.6 and 16.67 Hz) is critical to decoding. Chapter 3 introduces a state-of-the-art BCI that can decode text from neural activity during silent-speech attempts at 80 words per minute. Sound units and facial gestures were also predictable, enabling auditory and visual representations of attempted speech. These advancements illustrate the potential of speech BCIs to restore communication for those who have lost their voice.
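As a minimal illustration of isolating the low-frequency band reported as critical to decoding (0.6 to 16.67 Hz), the sketch below applies a zero-phase band-pass filter using SciPy. The sampling rate, filter order, and synthetic signal are illustrative assumptions, not details from the thesis.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 200.0  # assumed neural sampling rate in Hz (illustrative)

def lowfreq_band(signal, fs=FS, low=0.6, high=16.67, order=4):
    """Zero-phase band-pass filter keeping 0.6-16.67 Hz activity."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, signal, axis=-1)

# Example: a 2 s synthetic trace mixing an in-band 5 Hz component
# with an out-of-band 60 Hz component (e.g., line noise).
t = np.arange(0, 2, 1 / FS)
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 60 * t)
y = lowfreq_band(x)  # 5 Hz component retained, 60 Hz suppressed
```

In practice this would be applied per electrode channel before feeding features to a decoder; the forward-backward (`sosfiltfilt`) filtering shown here avoids phase distortion but is non-causal, so a real-time system would use a causal variant.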

Taken together, this research presents progress in the development of speech BCIs and marks a significant step toward improving the lives of individuals with communication impairments through versatile and comprehensive speech restoration systems.
