The Set Covering Machine

Marchand, Mario and Shawe-Taylor, J. (2002) The Set Covering Machine. Journal of Machine Learning Research, 3 (4-5), 723-746.

Record type: Article

Abstract

We extend the classical algorithms of Valiant and Haussler for learning compact conjunctions and disjunctions of Boolean attributes to allow features that are constructed from the data and to allow a trade-off between accuracy and complexity. The result is a general purpose learning machine, suitable for practical learning tasks, that we call the set covering machine. We present a version of the set covering machine that uses data-dependent balls for its set of features and compare its performance with the support vector machine. By extending a technique pioneered by Littlestone and Warmuth, we bound its generalization error as a function of the amount of data compression it achieves during training. In experiments with real-world learning tasks, the bound is shown to be extremely tight and to provide an effective guide for model selection.

Text

marchand02a.pdf - Other

Download (231kB)