UC Santa Cruz
Software Selection for Reliability Optimization using Time Series Analysis and Machine Learning
- Author(s): Mu, Yali
- Advisor(s): Desa, Subhas
- et al.
A software product family generally comprises multiple product lines or software versions, each typically offering different functionality. Functionality, usually measured by the number of features, generally increases with each new version. The availability of several versions of a software family raises important technical issues concerning the trade-off between software functionality and software reliability.
In this thesis, we consider the following issue: given a family of different versions of the same software product, how does one select the versions that maximize reliability, as measured by the number of software bugs? To resolve this issue, we develop a framework integrating time series analysis and machine learning to classify software versions of a given family into clusters based on user-specified reliability metrics. This classification, or separation, then enables the user to optimize the selection of software for maximum reliability.
Time series analysis is used to predict the future evolution of bugs. Machine learning methods, in particular Expectation-Maximization clustering and K-means clustering, are applied to prediction-based metrics to classify software versions into clusters. We demonstrate the application of this framework to reliability prediction, classification, and optimization of network IOS products.
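
The predict-then-cluster idea can be illustrated with a minimal Python sketch. It is not the implementation used in the thesis: a least-squares linear trend stands in for the time series model, a one-dimensional K-means stands in for the clustering step, and the version names, bug counts, forecast horizon, and helper names (`forecast_linear`, `kmeans_1d`) are all hypothetical.

```python
from statistics import mean


def forecast_linear(series, steps_ahead):
    """Fit y = a + b*t by ordinary least squares and extrapolate.

    Assumption: a linear trend is only a stand-in for a real time
    series model; the abstract does not specify the model used.
    """
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = mean(series)
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept + slope * (n - 1 + steps_ahead)


def kmeans_1d(values, k=2, iterations=50):
    """Lloyd's algorithm on scalar values (requires k >= 2)."""
    ordered = sorted(values)
    # Deterministic initialisation: spread centroids over the sorted values.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: nearest centroid for each value.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical cumulative bug counts per reporting period (made-up data).
bug_history = {
    "v1.0": [5, 6, 6, 7, 8],
    "v1.1": [4, 5, 5, 6, 6],
    "v2.0": [10, 14, 19, 23, 28],
    "v2.1": [12, 15, 20, 26, 30],
}

versions = list(bug_history)
# Prediction-based metric: forecast bug count three periods ahead.
predicted = [forecast_linear(bug_history[v], steps_ahead=3) for v in versions]

labels, centroids = kmeans_1d(predicted, k=2)

# Treat the cluster with the lowest predicted bug count as most reliable.
best_cluster = min(range(len(centroids)), key=lambda c: centroids[c])
selected = [v for v, lab in zip(versions, labels) if lab == best_cluster]
print("Versions in the most reliable cluster:", selected)
```

In this sketch the selection step simply picks the cluster whose centroid has the fewest predicted bugs; a user-specified reliability metric, or an Expectation-Maximization fit of a mixture model, could replace either step without changing the overall structure.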