Further Results on the Margin Distribution
Shawe-Taylor, John and Cristianini, Nello (1999) Further Results on the Margin Distribution. In Proceedings of COLT'99, pp. 278-285. Association for Computing Machinery.
Record type: Conference or Workshop Item (Paper)
Abstract
A number of results have bounded the generalization error of a classifier in terms of its margin on the training points. There has been some debate about whether the minimum margin is the best measure of the distribution of training set margin values with which to estimate the generalization error. Freund and Schapire [7] have shown how a different function of the margin distribution can be used to bound the number of mistakes of an on-line learning algorithm for a perceptron, as well as an expected error bound. Shawe-Taylor and Cristianini [13] showed that a slight generalization of their construction can be used to give a PAC-style bound on the tail of the distribution of the generalization errors that arise from a given sample size when using threshold linear classifiers. We show that in the linear case the approach can be viewed as a change of kernel and that the algorithms arising from the approach are exactly those originally proposed by Cortes and Vapnik [4]. We generalise the basic result to function classes with bounded fat-shattering dimension and the ℓ1 measure for slack variables, which gives rise to Vapnik's box constraint algorithm. Finally, application to regression is considered, which includes standard least squares as a special case.
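As a gloss on the "change of kernel" claim above, here is a brief textbook-style sketch under standard assumptions (the notation below is illustrative, not taken verbatim from the paper). The 2-norm soft-margin problem

    \min_{w, b, \xi} \; \|w\|^2 + C \sum_i \xi_i^2
    \quad \text{subject to} \quad y_i\big(\langle w, x_i \rangle + b\big) \ge 1 - \xi_i,

has a dual that coincides with the hard-margin dual computed under the modified kernel

    K'(x_i, x_j) = K(x_i, x_j) + \tfrac{1}{C}\, \delta_{ij},

i.e. the soft-margin machine is equivalent to a hard-margin machine trained after adding 1/C to the diagonal of the kernel matrix.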
This record has no associated files available for download.
More information
Published date: 1999
Organisations: Electronics & Computer Science
Identifiers
Local EPrints ID: 259651
URI: http://eprints.soton.ac.uk/id/eprint/259651
ISBN: 1581131674
PURE UUID: 586fa9a2-0f84-4526-91bf-3617e1633a4b
Catalogue record
Date deposited: 12 Aug 2004
Last modified: 08 Dec 2023 17:33
Contributors
Author: John Shawe-Taylor
Author: Nello Cristianini