UC Santa Cruz
Efficient Bug Prediction and Fix Suggestions
- Author(s): Shivaji, Shivkumar
- Advisor(s): Whitehead, E James, et al.
Bugs are a well-known Achilles' heel of software development. In the last few years, machine learning techniques for combating software bugs have become popular. However, the results of these techniques are not yet good enough for practical adoption. In addition, most techniques do not explain why a code change is considered buggy.
Furthermore, suggestions to fix the bug would be greatly beneficial. An added bonus would be engaging humans to improve the bug and fix prediction process.
This dissertation presents a step-by-step procedure that effectively predicts buggy code changes (Bug Prognosticator), produces bug fix suggestions (Fix Suggester), and utilizes human feedback. Each of these steps can be used independently, but combining them allows more effective management of bugs. The techniques are tested on many open source projects and one large commercial project. Human feedback was used to understand and improve the performance of the techniques. Feedback was gathered primarily from industry participants in order to assess practical suitability.
The Bug Prognosticator explores feature selection techniques and classifiers to improve the results of bug prediction on code changes. The optimized Bug Prognosticator achieves an average of 97% precision and 70% recall when evaluated on eleven projects, ten open source and one commercial.
The Fix Suggester uses the Bug Prognosticator and statistical analysis of keyword term frequencies to suggest unordered fix keywords for a code change predicted to be buggy. The suggestions are validated against actual bug fixes to confirm their utility. The Fix Suggester achieves 46.9% precision and 38.9% recall on its predicted fix tokens. This is a reasonable start on the difficult problem of predicting the contents of a bug fix.
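To make the token-level metrics concrete, the following is a minimal sketch of how precision and recall could be computed when suggested and actual fix keywords are compared as unordered sets. The function name and the example tokens are hypothetical, and the dissertation's exact evaluation procedure may differ.

```python
def token_precision_recall(predicted, actual):
    """Return (precision, recall) for two unordered collections of keywords.

    Assumption: a suggested keyword counts as correct if it appears
    anywhere in the actual bug fix, regardless of order or frequency.
    """
    predicted, actual = set(predicted), set(actual)
    hits = predicted & actual  # keywords that were both suggested and used
    precision = len(hits) / len(predicted) if predicted else 0.0
    recall = len(hits) / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical tokens, for illustration only.
suggested = {"null", "check", "return", "lock"}
real_fix = {"null", "check", "if", "throw", "return"}

precision, recall = token_precision_recall(suggested, real_fix)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here three of the four suggested keywords appear in the real fix, and three of the five keywords in the real fix were suggested.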
To improve the efficiency of the Bug Prognosticator and the Fix Suggester, active learning is employed with willing human participants. Developers aid the Bug Prognosticator and the Fix Suggester on code changes that machines find hard to evaluate. The developers' feedback is used to enhance the performance of the Bug Prognosticator and the Fix Suggester. In addition, a user study is performed to gauge the utility of the Fix Suggester.
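One common way to decide which code changes are "hard to evaluate" is uncertainty sampling, where the changes whose predicted probability of being buggy lies closest to 0.5 are sent to a developer. The sketch below illustrates that idea only. The selection criterion, the function name, and the example scores are assumptions and are not taken from the dissertation.

```python
def select_for_review(buggy_probabilities, budget):
    """Pick the change ids the classifier is least sure about.

    buggy_probabilities: dict mapping a change id to the predicted
    probability that the change is buggy (hypothetical classifier output).
    budget: how many changes a developer is willing to label.
    """
    ranked = sorted(
        buggy_probabilities,
        key=lambda change_id: abs(buggy_probabilities[change_id] - 0.5),
    )
    return ranked[:budget]


# Hypothetical scores, for illustration only.
scores = {"c1": 0.97, "c2": 0.52, "c3": 0.08, "c4": 0.41, "c5": 0.66}

queries = select_for_review(scores, budget=2)
print(queries)
```

The developer's labels for the selected changes would then be added to the training data before the classifier is retrained.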
The dissertation concludes with a discussion of future work and challenges faced by the techniques. Given the success of statistical defect prediction techniques, more industrial exposure would benefit researchers and software practitioners.