We present a simple, agnostic active learning algorithm that works
for any hypothesis class of bounded VC dimension, and any data distribution.
Our algorithm extends a scheme of Cohn, Atlas, and Ladner to the agnostic
setting, by (1) reformulating it using a reduction to supervised learning and
(2) showing how to apply generalization bounds even for the non-i.i.d. samples
that result from selective sampling. We provide a general characterization of
the label complexity of our algorithm. This quantity is never more than the
usual PAC sample complexity of supervised learning, and is exponentially
smaller for some hypothesis classes and distributions. We also demonstrate
improvements experimentally.
Pre-2018 CSE ID: CS2007-0898