Flash memory becomes increasingly prone to failure as the number of program-erase cycles increases. These physical failures in the flash raise the bit error rate.
Once the bit error count exceeds a certain threshold, the error correction engine either cannot correct the errors without adversely affecting system performance or fails entirely. This motivates studying how error counts grow and how pages fail in flash memory, with the aim of predicting failures. We tackle this problem using a machine learning approach. However, standard machine learning techniques may not work well on the data at hand: because the error counts are collected from actual flash memory, one can expect many more pages with a low error count than with a high one. We therefore formulate our goal as a classification problem with significant class imbalance in the underlying data. We have investigated several classification methods that address such imbalance, including cost-sensitive boosting techniques, bagging procedures, bagging ensembles of support vector machines (SVMs), and cost-sensitive neural networks.
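To make the class-imbalance formulation concrete, the following minimal sketch labels pages as failing when their error count exceeds a threshold, and then shows two generic ingredients of cost-sensitive classification: inverse-frequency class weights and a cost-adjusted decision threshold. It is an illustration only, not the method evaluated in this work. The error threshold, the synthetic error counts, the misclassification costs, and all function names are hypothetical assumptions.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def cost_sensitive_threshold(cost_fp, cost_fn):
    """Probability above which predicting 'failing' has lower expected cost.

    Predicting 'failing' costs (1 - p) * cost_fp in expectation, while
    predicting 'healthy' costs p * cost_fn, so 'failing' is preferred
    when p > cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def predict(prob_failing, cost_fp, cost_fn):
    """Return 1 (failing) or 0 (healthy) under the given misclassification costs."""
    return int(prob_failing > cost_sensitive_threshold(cost_fp, cost_fn))


# Hypothetical data: 950 pages with few bit errors, 50 pages with many.
ERROR_THRESHOLD = 40  # assumed correction limit, not taken from the source
error_counts = [3] * 950 + [55] * 50
labels = [int(count > ERROR_THRESHOLD) for count in error_counts]

weights = balanced_class_weights(labels)
print(weights)

# A page with an estimated 20% failure probability is flagged only when a
# missed failure is assumed to cost more than a false alarm.
print(predict(0.2, cost_fp=1, cost_fn=1))
print(predict(0.2, cost_fp=1, cost_fn=9))
```

With equal costs the decision threshold is 0.5, so the rare failing class is easily overlooked. Raising the assumed cost of a missed failure lowers the threshold, which is the basic effect that cost-sensitive methods exploit on imbalanced data.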