Sparse Coding of Speech Data Predicts Properties of the Early Auditory System
- Author(s): Carlson, Nicole
- Advisor(s): DeWeese, Michael R
- et al.
I have developed a sparse mathematical representation of speech that minimizes the number of active model neurons needed to represent typical speech sounds. The model learns several well-known acoustic features of speech such as harmonic stacks, formants, onsets and terminations, but I also find more exotic structures in the spectrogram representation of sound such as localized checkerboard patterns and frequency-modulated excitatory subregions flanked by suppressive sidebands. Moreover, several of these novel features resemble neuronal receptive fields reported in the Inferior Colliculus (IC), as well as auditory thalamus and cortex, and my model neurons exhibit the same tradeoff in spectrotemporal resolution as has been observed in IC. To my knowledge, this is the first demonstration that receptive fields of neurons in the ascending mammalian auditory pathway beyond the auditory nerve can be predicted based on coding principles and the statistical properties of recorded sounds. In my second study, I look at linear filter estimation by creating spike-triggered averages for my model neurons. Surprisingly, whitening does not remove the effect of choosing different probe stimulus sets. This suggests that the type of probe stimulus is very important for uncovering the true receptive field of a neuron.
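
For readers unfamiliar with spike-triggered averaging, the following is a minimal sketch in Python of how such an average could be computed. It is an illustration only, not the code used in the thesis. The function name `spike_triggered_average`, the list-of-frames stimulus layout, and the `n_lags` parameter are all assumptions made for this example. The sketch computes the raw average and omits the whitening step.

```python
def spike_triggered_average(stimulus, spikes, n_lags):
    """Average the stimulus history that preceded each spike.

    stimulus: list of frames, each a list of floats (e.g. one
              spectrogram column per time step). Hypothetical layout.
    spikes:   list of non-negative spike counts, one per frame.
    n_lags:   number of time steps of history to include.

    Returns a list of n_lags rows; row `lag` is the mean stimulus
    frame observed `lag` steps before a spike (lag 0 = same step).
    """
    n_channels = len(stimulus[0])
    sta = [[0.0] * n_channels for _ in range(n_lags)]
    total_spikes = 0

    # Start at n_lags - 1 so that a full history window is available.
    for t in range(n_lags - 1, len(stimulus)):
        count = spikes[t]
        if count == 0:
            continue
        total_spikes += count
        for lag in range(n_lags):
            frame = stimulus[t - lag]
            for c in range(n_channels):
                # Weight each frame by the number of spikes it preceded.
                sta[lag][c] += count * frame[c]

    if total_spikes == 0:
        return sta  # no spikes: return the all-zero estimate

    return [[value / total_spikes for value in row] for row in sta]


# Tiny usage example with a two-channel stimulus and one spike.
toy_stimulus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_spikes = [0, 0, 1]
toy_sta = spike_triggered_average(toy_stimulus, toy_spikes, n_lags=2)
```

Because the estimate is an average over whatever stimuli happened to drive the neuron, it inherits the statistics of the probe set, which is the dependence the second study examines.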