For over half a century, computer scientists and psychologists have strived to build machines that teach humans automatically, sometimes dubbed intelligent tutoring systems (ITS). The earliest such systems focused on "flashcard"-style vocabulary learning, while more modern ITS can tutor students in diverse subjects such as high school geometry, physics, algebra, and computer programming. Compared to human tutors, however, most contemporary ITS still rely on an impoverished set of low-bandwidth sensors consisting of mouse clicks, keystrokes, and touch events. In contrast, human teachers base their decisions not only on students' explicit responses to practice problems and test questions, but also on auditory and visual information about the students' affective, or emotional, states. It is possible that, if automated teaching systems were affect-sensitive and could reliably detect and respond to their students' emotions, they could teach even more effectively. In this dissertation we examine the affect-sensitive teaching problem from a stochastic optimal control (SOC) perspective. Stochastic optimal control theory provides a rigorous computational framework for describing the challenges and possible benefits of affect-sensitive teaching systems, and also provides computational tools that may help in building them. After framing the problem of affect-sensitive teaching in the language of SOC, we (1) present an experimental technique for measuring how important affect-sensitivity is to teaching within a given learning domain. Next, we develop machine learning and computer vision tools to automatically recognize certain aspects of the student's affective state in real time, including (2) student "engagement" and (3) the student's perception of curriculum difficulty. Finally, we (4) propose and evaluate an automated procedure, based on SOC, for creating an automated teacher that teaches foreign language by image association (à la Rosetta Stone).
In a language learning experiment with 90 human subjects, the controller developed using SOC yielded higher learning gains than two heuristic controllers; it also allows affective observations to be integrated easily into the decision-making process.