We present a new approach for creating repositories of real software faults, and we have developed a tool, the Automatic Fault IDentification Tool (AFID), that implements it. For every crashing fault that the developer discovers, AFID records both a fault-revealing test case and the faulty version of the source code; for every crashing fault that the developer corrects, it records the fault-correcting source code change. The test cases are a significant contribution because they enable new research that explores the dynamic behavior of software faults. AFID uses an operating-system-level monitoring mechanism to observe both the compilation and the execution of the application, which makes it straightforward to support a wide range of programming languages and compilers.
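To make the idea of a recorded fault-revealing test case concrete, the following is a minimal sketch in Python. It is not AFID's implementation: it assumes a POSIX system and only inspects the exit status of a child process, whereas AFID monitors compilation and execution at the operating system level. The names `CrashRecord` and `run_and_record` are hypothetical and introduced purely for illustration.

```python
import signal
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class CrashRecord:
    """Hypothetical record of one crashing run: enough to replay it later."""
    argv: List[str]      # command line that triggered the crash
    stdin_data: bytes    # input fed to the program
    signal_name: str     # signal that terminated the process, e.g. "SIGSEGV"


def run_and_record(argv: Sequence[str], stdin_data: bytes = b"") -> Optional[CrashRecord]:
    """Run a program and return a CrashRecord if a signal killed it.

    Assumption: on POSIX, subprocess reports termination by signal N
    as a return code of -N. Normal exits (including non-zero exit
    codes) are not treated as crashes in this sketch.
    """
    proc = subprocess.run(
        list(argv),
        input=stdin_data,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if proc.returncode >= 0:
        return None
    return CrashRecord(
        argv=list(argv),
        stdin_data=stdin_data,
        signal_name=signal.Signals(-proc.returncode).name,
    )


if __name__ == "__main__":
    # Simulate a crashing program by having a child process signal itself.
    crashing = [sys.executable, "-c",
                "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
    print(run_and_record(crashing, b"example input"))
```

A fuller tool along these lines would presumably also snapshot the source files that produced the crashing binary, so that the test case and the faulty version stay paired; that step is omitted here.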
We present our experience using AFID in a controlled case study and in a real development environment, where it collected software faults during the internal development of our group’s compiler. The case studies collected several real software faults and validated the basic approach. The longer-term internal study revealed weaknesses in using the original version of AFID for real development, and this experience led to a number of refinements to the tool. We have collected over 20 real software faults in large programs and continue to collect more.