Information technology is now ubiquitous in higher education institutions worldwide. More than 85% of American universities use e-learning systems to supplement traditional classroom activities. An obvious benefit of these online tools is their ability to automatically grade exercises submitted by students and provide immediate feedback. Most of these systems, however, provide binary (correct/incorrect) feedback to students.
While some educators find such feedback useful, we have found that binary instant feedback leads to plagiarism and disengagement from the exercises, as some students need additional guidance to overcome obstacles to understanding.
In an effort to address the shortcomings of binary feedback, we designed a Case-Based Reasoning (CBR) framework that generates detailed feedback on programming exercises by reusing knowledge provided by human instructors. A crucial component of the framework is the ability to recognize incorrectness similarity between programs: two programs are considered similarly incorrect if they contain similar bugs, which ensures that corrective feedback generated for one program is equally appropriate for the other.
We investigated several approaches to computing incorrectness similarity, including static analysis of source code, execution traces of running programs, and comparison of test-case outputs. Given the kinds of errors our students made, the dynamic approach of comparing test-case outputs proved to be the most accurate.
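To make the test-output approach concrete, the following is a minimal sketch, not Compass's actual implementation; the test inputs, timeout, and hashing scheme are illustrative assumptions. The idea is that two submissions receive the same signature exactly when they behave identically on the instructor's test suite, which we treat as a proxy for containing similar bugs.

```python
import hashlib
import subprocess

# Hypothetical test inputs for a single exercise; in practice these would
# come from an instructor-authored test suite.
TEST_INPUTS = ["3 4\n", "0 0\n", "-1 5\n"]

def output_signature(program_path: str) -> str:
    """Run a submission on every test input and hash the resulting outputs."""
    outputs = []
    for test_input in TEST_INPUTS:
        try:
            result = subprocess.run(
                ["python3", program_path],
                input=test_input,
                capture_output=True,
                text=True,
                timeout=5,
            )
            outputs.append(result.stdout)
        except subprocess.TimeoutExpired:
            # Non-terminating submissions form their own equivalence class.
            outputs.append("<timeout>")
    return hashlib.sha256("\x00".join(outputs).encode()).hexdigest()

def similarly_incorrect(path_a: str, path_b: str) -> bool:
    """Two programs are similarly incorrect if their test outputs match."""
    return output_signature(path_a) == output_signature(path_b)
```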
We built an e-learning system, called Compass, on top of this CBR platform. Compass was deployed in a live classroom at the University of California, Merced, in the Spring 2017 semester. We compared data collected from this class with data from previous offerings of the course, in which students completed the same exercises but received only binary instant feedback.
We found that the introduction of Compass, and the detailed feedback it generates on programming exercises, led to a statistically significant decrease in plagiarism and disengagement rates. In addition, students completed exercises faster and with fewer errors. All of these factors are associated with improved student learning.
Another significant aspect of Compass is that it scales well to large class sizes. This is because the number of distinct mistakes students make is relatively small, while many students make the same mistake. Together, these two conditions enable the CBR engine of Compass to serve a large number of students with minimal instructor intervention.
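As an illustration of why these conditions enable scaling, here is a hypothetical sketch of the retrieve-and-reuse loop; the class and method names are ours, not Compass's. Feedback authored once for a signature is reused for every later submission that matches it, so instructor effort grows with the number of distinct mistakes rather than with the number of students.

```python
from typing import Dict, List, Optional

class CaseBase:
    """Sketch of a CBR case base keyed by output signatures (assumed design)."""

    def __init__(self) -> None:
        self.cases: Dict[str, str] = {}  # output signature -> feedback text
        self.pending: List[str] = []     # signatures awaiting instructor feedback

    def feedback_for(self, signature: str) -> Optional[str]:
        if signature in self.cases:
            return self.cases[signature]    # reuse: no instructor effort needed
        if signature not in self.pending:
            self.pending.append(signature)  # new mistake: queue it exactly once
        return None                         # no detailed feedback available yet

    def add_case(self, signature: str, feedback: str) -> None:
        """Instructor authors feedback once; all matching submissions reuse it."""
        self.cases[signature] = feedback
        if signature in self.pending:
            self.pending.remove(signature)
```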
Work is currently underway to incorporate Compass into other undergraduate courses at the University of California, Merced. As future work, we plan to investigate the effects of Compass on underrepresented student populations. We have reason to believe that Compass can provide much-needed help to students who may lack the confidence to seek such assistance on their own.