Neutral zone classifiers allow for a region of neutrality when there is inadequate information to assign a predicted class label with suitable confidence. A neutral zone classifier is defined by classification regions that trade off the cost of an incorrect classification against the cost of remaining neutral. We derive a Bayes neutral zone classifier and demonstrate that it outperforms previous neutral zone classifiers with respect to both the expected cost of misclassification and computational complexity. We also identify the scenarios in which the previous neutral zone classifiers and the proposed Bayes neutral zone classifier are equivalent in the two-class and three-class settings.
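To make the cost trade-off concrete, the following is a minimal sketch of a cost-based neutral zone decision rule. It is not taken from the derivation summarized here: it assumes a constant misclassification cost and a constant neutrality cost, and the names `neutral_zone_classify`, `cost_error`, `cost_neutral`, and `NEUTRAL` are hypothetical. Under those assumptions, a class label is assigned only when the expected cost of committing to the most probable class does not exceed the cost of remaining neutral.

```python
import numpy as np

NEUTRAL = -1  # hypothetical label used here to mark the neutral zone


def neutral_zone_classify(posteriors, cost_error=1.0, cost_neutral=0.2):
    """Assign a class or stay neutral, given posterior class probabilities.

    Assumes (for illustration only) a constant cost for any wrong label
    and a constant cost for remaining neutral.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    best = posteriors.argmax(axis=1)
    # Expected cost of committing to the most probable class.
    expected_error_cost = cost_error * (1.0 - posteriors.max(axis=1))
    # Commit only when that cost is no larger than the cost of neutrality.
    return np.where(expected_error_cost <= cost_neutral, best, NEUTRAL)


posteriors = [
    [0.90, 0.05, 0.05],  # confident: class 0
    [0.40, 0.35, 0.25],  # ambiguous: neutral
    [0.05, 0.10, 0.85],  # confident: class 2
]
labels = neutral_zone_classify(posteriors)
```

Raising `cost_neutral` relative to `cost_error` shrinks the neutral zone; once it reaches `cost_error * (1 - 1/K)` for `K` classes, this rule never remains neutral and reduces to choosing the most probable class.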
Following the theoretical derivation of the Bayes neutral zone classifier, we extend the methodology to the unsupervised and semi-supervised settings via the EM algorithm; previous neutral zone classifiers have dealt only with the supervised setting. The discussion of unsupervised and semi-supervised neutral zone classifiers covers both the parametric and nonparametric cases. Simulation studies in both cases show the improvements that can be obtained by adding labeled data for semi-supervised learning.
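As an illustration of how labeled data can enter an EM fit in the parametric case, here is a minimal sketch for a two-component, one-dimensional Gaussian mixture. It is an assumption-laden example rather than the procedure developed in this work: the function name `semi_supervised_em`, the initialization, and the fixed iteration count are all hypothetical choices. Labeled points keep fixed one-hot responsibilities, while unlabeled points receive posterior class probabilities in each E-step; passing no labeled points gives the unsupervised case.

```python
import numpy as np


def semi_supervised_em(x_unlabeled, x_labeled, y_labeled, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture with optional labels."""
    x_u = np.asarray(x_unlabeled, dtype=float)
    x_l = np.asarray(x_labeled, dtype=float)
    y_l = np.asarray(y_labeled, dtype=int)
    x = np.concatenate([x_l, x_u])

    # Labeled points: one-hot responsibilities, fixed for all iterations.
    r_l = np.eye(2)[y_l]

    # Crude initialization from the pooled data (an assumption of this sketch).
    mu = np.array([x.min(), x.max()])
    sigma = np.array([x.std(), x.std()])
    pi = np.array([0.5, 0.5])

    for _ in range(n_iter):
        # E-step: posterior class probabilities for unlabeled points only.
        z = (x_u[:, None] - mu) / sigma
        dens = pi * np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))
        r_u = dens / dens.sum(axis=1, keepdims=True)
        r = np.vstack([r_l, r_u])

        # M-step: weighted updates using labeled and unlabeled points.
        n_k = r.sum(axis=0)
        pi = n_k / len(x)
        mu = (r * x[:, None]).sum(axis=0) / n_k
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n_k)

    return pi, mu, sigma, r_u


rng = np.random.default_rng(0)
x_u = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(2.0, 1.0, 200)])
pi, mu, sigma, r_u = semi_supervised_em(x_u, [-2.1, 1.9], [0, 1])
```

The returned `r_u` holds estimated posterior probabilities for the unlabeled points, which is the kind of quantity a cost-based neutral zone rule would then threshold.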
The Bayes neutral zone classifier is illustrated with a microbial community profiling application in which no training data is available. In this example we show the benefits obtained over previous neutral zone classifiers. A simulation study additionally investigates the benefits of using neutral zone classification to remove noise from microbial community profiling data sets.