In “A Sequential Algorithm for Training Text Classifiers” by David D.
Lewis and William Gale, the authors put forth a new (at the time)
method for training text classifiers using an approach they call
“uncertainty sampling”.

Section 1 outlines the central problem of training: obtaining a good
sample of text for a human to label for the trainer.  After dismissing
several other methods of gathering samples (random sampling, relevance
feedback), Lewis and Gale introduce an iterative approach for
manually labeling examples.

Section 2 then discusses the benefits of “learning by query” in
theory, namely the possibility of reducing the error rate very
quickly in comparison to the number of queries required.

Figure 1 (described in section 3) outlines their basic approach,
which relies on having a human judge the subset of examples that the
current classifier is least certain about.  The classifier is then
retrained on the enlarged labeled set, and the process is iterated
until the human is satisfied with the results.  One caveat of this
approach is that the classifier must not only predict the class but
also give a measurement of certainty for that prediction.
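
The loop described above can be sketched in a few lines of Python.
This is only an illustration under my own assumptions, not the
authors' implementation: the function names, the toy one-dimensional
"documents", the threshold classifier, and the fixed number of rounds
(standing in for "until the human is satisfied") are all made up for
the example.  "Least certain" is taken to mean an estimated
probability closest to 0.5 for a two-class problem.

```python
import math


def most_uncertain(pool, predict_proba, batch_size):
    """Indices of the pool items whose estimated probability of
    belonging to the class is closest to 0.5."""
    ranked = sorted(range(len(pool)),
                    key=lambda i: abs(predict_proba(pool[i]) - 0.5))
    return ranked[:batch_size]


def uncertainty_sampling(pool, oracle, train, seed, batch_size=1, rounds=4):
    """Repeatedly train, ask the oracle (the human judge) about the
    least certain examples, and add the answers to the labeled set."""
    labeled = list(seed)
    unlabeled = list(pool)
    for _ in range(rounds):
        if not unlabeled:
            break
        predict_proba = train(labeled)
        picks = most_uncertain(unlabeled, predict_proba, batch_size)
        # Pop from the highest index down so earlier indices stay valid.
        for i in sorted(picks, reverse=True):
            x = unlabeled.pop(i)
            labeled.append((x, oracle(x)))
    return train(labeled), labeled


def train_threshold(labeled):
    """Hypothetical toy classifier: a logistic curve centred halfway
    between the largest negative and smallest positive example.
    It needs at least one example of each class."""
    pos = [x for x, y in labeled if y]
    neg = [x for x, y in labeled if not y]
    boundary = (min(pos) + max(neg)) / 2
    return lambda x: 1 / (1 + math.exp(-20 * (x - boundary)))


# Toy run: the "true" concept is x > 0.62, unknown to the learner.
pool = [i / 20 for i in range(1, 20)]
seed = [(0.0, False), (1.0, True)]
model, labeled = uncertainty_sampling(pool, lambda x: x > 0.62,
                                      train_threshold, seed)
print(labeled[2:])  # the examples the "human" was asked to judge
```

In this toy run every query lands near the true boundary, which is the
intuition behind the approach: labels are spent where the classifier
is unsure rather than on examples it already handles confidently.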

Continuing on into section 4, we are introduced to how to build a
classifier and use uncertainty sampling to train it.  Most of the
section details the probability theory behind it, finishing up with
how to do the sampling.  One thing I always wish for in these papers
is a concrete example (maybe as an appendix or a reference) that works
through the math on an actual toy problem.  Section 5 does just this,
laying out an experiment and discussing the details, minus the math,
which probably suits most people just fine.

Section 7 has an excellent discussion of the results, the pay dirt
being that using this new method significantly reduces the number of
examples required for training, at the cost of having a human in the
loop.
