In this work, we propose and evaluate an active learning algorithm in context of CPSGrader, an automatic grading and feedback generation tool for laboratory-based courses in the area of cyber-physical systems. CPSGrader detects the presence of certain classes of mistakes using test benches that are generated in part via machine learning from solutions that have the fault and those that do not (positive and negative examples). We develop a clustering-based active learning technique that selects from a large database of unlabeled solutions, a small number of reference solutions for the expert to label that will be used as training data. The goal is to achieve better accuracy of fault identification with fewer reference solutions as compared to random selection. We demonstrate the effectiveness of our algorithm using data obtained from an on-campus laboratory-based course at UC Berkeley.