Trial-and-error skill learning involves generating variation in behavioral performance ('exploratory variation') and modifying the motor program to produce the behavioral variants associated with better reinforcement. This thesis addresses the computational rules and neural circuitry involved in trial-and-error skill learning.
We investigated trial-and-error learning using adult birdsong as a model system. Adult Bengalese finch song consists of a sequence of syllables: vocal gestures lasting 50-100 ms, each with a characteristic acoustic structure. In each experiment, we delivered differential reinforcement contingent on the fundamental frequency of a targeted syllable of adult song. Such reinforcement elicited modification of the fundamental frequency of the targeted syllable by trial-and-error.
First, we found that the nervous system keeps track of fine-grained exploratory variation in fundamental frequency and uses this information for trial-and-error learning. Songbirds learned to produce the average of fundamental frequency trajectories that were associated with better reinforcement. This learning rule accurately predicted learned changes in fundamental frequency trajectory on a timescale of milliseconds. Such a capacity of the nervous system to exploit fine-grained behavioral variation for trial-and-error learning could support the acquisition of fine motor skills like human speech.
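The averaging rule described above can be illustrated with a minimal sketch. It assumes that each rendition of the targeted syllable is stored as one row of fundamental frequency samples, and that reinforcement is decided by comparing the mean fundamental frequency in a contingency window against a threshold. The function names, the window, and the threshold are hypothetical choices for illustration, not the procedure used in the experiments.

```python
import numpy as np


def reinforce_by_threshold(trajectories, window, threshold):
    """Label each rendition as receiving better reinforcement (True) or not.

    Hypothetical contingency: a rendition is labeled True when its mean
    fundamental frequency inside `window` (start, stop sample indices)
    exceeds `threshold`.
    """
    trajectories = np.asarray(trajectories, dtype=float)
    start, stop = window
    return trajectories[:, start:stop].mean(axis=1) > threshold


def predict_learned_trajectory(trajectories, rewarded):
    """Average the trajectories associated with better reinforcement.

    `trajectories` has shape (n_renditions, n_time_samples) and `rewarded`
    is a boolean vector with one entry per rendition.
    """
    trajectories = np.asarray(trajectories, dtype=float)
    rewarded = np.asarray(rewarded, dtype=bool)
    if trajectories.ndim != 2 or rewarded.shape != (trajectories.shape[0],):
        raise ValueError("expected a 2-D trajectory array and one label per row")
    if not rewarded.any():
        raise ValueError("no rendition was associated with better reinforcement")
    # Time-point-by-time-point mean over the better-reinforced renditions.
    return trajectories[rewarded].mean(axis=0)


# Toy data: three renditions, three time samples each (arbitrary units).
toy_trajectories = np.array([
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
    [5.0, 6.0, 7.0],
])
toy_rewarded = reinforce_by_threshold(toy_trajectories, window=(1, 2), threshold=3.0)
toy_prediction = predict_learned_trajectory(toy_trajectories, toy_rewarded)
print(toy_prediction)
```

Because the average is taken separately at every time sample, the predicted trajectory retains whatever fine temporal structure distinguishes the better-reinforced renditions, which is the property the abstract emphasizes.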
Second, we found a surprising difference between the neural circuitry that generates exploratory variation and the neural circuitry that exploits such variation for trial-and-error learning. During training with differential reinforcement, we blocked the output of the AFP, a cortical-basal ganglia circuit that transmits exploratory variation and is necessary for song learning. This prevented the AFP from transmitting exploratory variation and prevented any expression of learning during training; nevertheless, robust and precise learning appeared immediately when we unblocked AFP output following training. Moreover, inactivating a region within the AFP during training prevented learned changes to song both during and after training, confirming that the AFP is necessary for learning. These results suggest that the AFP receives information about exploratory variation generated by other premotor regions (an 'efference copy'), exploits this information for learning, and implements the learned behavior.
Together, our results indicate that cortical-basal ganglia circuits contribute to trial-and-error learning by associating reinforcement signals with exceptionally detailed information about exploratory variation generated throughout the brain.