The Set Covering Machine
We extend the classical algorithms of Valiant and Haussler for learning compact conjunctions and disjunctions of Boolean attributes to allow features that are constructed from the data and to allow a trade-off between accuracy and complexity. The result is a general-purpose learning machine, suitable for practical learning tasks, that we call the set covering machine. We present a version of the set covering machine that uses data-dependent balls for its set of features and compare its performance with that of the support vector machine. By extending a technique pioneered by Littlestone and Warmuth, we bound its generalization error as a function of the amount of data compression it achieves during training. In experiments with real-world learning tasks, the bound is shown to be extremely tight and to provide an effective guide for model selection.
723-746
Marchand, Mario
Shawe-Taylor, J.
2002
Marchand, Mario and Shawe-Taylor, J. (2002) The Set Covering Machine. Journal of Machine Learning Research, 3 (4-5), 723-746.
More information
Published date: 2002
Organisations:
Electronics & Computer Science
Identifiers
Local EPrints ID: 259011
URI: http://eprints.soton.ac.uk/id/eprint/259011
PURE UUID: 2914ecee-60fd-4c22-bd94-90813dce4f3a
Catalogue record
Date deposited: 05 Mar 2004
Last modified: 14 Mar 2024 06:18
Contributors
Author: Mario Marchand
Author: J. Shawe-Taylor