Static type systems are a powerful tool for reasoning about the safety of programs. Global type inference eliminates one of the prime complaints against static types, that the annotation burden is too high. However, this introduces its own problems as the type checker must now make assumptions about what the programmer intended to do. A single incorrect assumption can lead the type checker to erroneously blame an expression far from the actual error the programmer made, which can be particularly confusing for newcomers who have not yet constructed a mental model for how the type checker works.
In this dissertation we present a pair of complementary techniques to localize and explain type errors, with an emphasis on the errors encountered by novice users.
We tackle the localization problem by using machine learning to learn a model of the errors made by students in an introductory course. Then, we use the model to produce a ranked list of likely error locations in new programs. Our models can be trained on a modest amount of data, e.g. a single instance of a course, and we envision a future where each introductory course is accompanied by a model of its students’ errors.
To better explain the error to novice users, we present a runtime error that the type system would have prevented. We interleave type-checking and execution to search for a set of program inputs that would lead execution to a bad state, and present the execution trace to the user in an interactive debugger. This allows the user to explore why their program was rejected, and connects the dynamic (runtime) semantics to the static (typing) semantics.
We have evaluated our techniques empirically using a new dataset of ill-typed student programs collected from two instances of an undergraduate programming languages course at UC San Diego. We have also performed user studies with novice users, comparing the output of our techniques with the state of the art in type error diagnosis. Our results show that these are practical, lightweight techniques for improving the error messages produced by type checkers.