Estimating the Support of a High-Dimensional Distribution

Pages: 1443-1471

Schölkopf, B.

Platt, J.C.

Shawe-Taylor, J.S.

Smola, A.J.

Williamson, R.C.

July 2001

Schölkopf, B., Platt, J.C., Shawe-Taylor, J.S., Smola, A.J. and Williamson, R.C. (2001) Estimating the Support of a High-Dimensional Distribution. *Neural Computation*, 13 (7), 1443-1471.

## Abstract

Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a “simple” subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1. We propose a method to approach this problem by trying to estimate a function f that is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabeled data.
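The quadratic program mentioned in the abstract can be sketched as follows. This is a hedged illustration in notation commonly used for this method, not text from this record: the symbols $\nu$ (the specified fraction), $\rho$ (offset), $\xi_i$ (slack variables), $\Phi$ (feature map), $k$ (kernel), $\alpha_i$ (expansion coefficients), and $\ell$ (number of training points) are assumptions introduced here.

```latex
% Primal problem: keep the weight vector short while separating the data from the origin
\min_{w,\,\xi,\,\rho} \;\; \frac{1}{2}\|w\|^2 + \frac{1}{\nu \ell}\sum_{i=1}^{\ell} \xi_i - \rho
\quad \text{subject to} \quad
\langle w, \Phi(x_i) \rangle \ge \rho - \xi_i, \qquad \xi_i \ge 0 .

% Dual problem: a quadratic program in the expansion coefficients
\min_{\alpha} \;\; \frac{1}{2}\sum_{i,j=1}^{\ell} \alpha_i \alpha_j \, k(x_i, x_j)
\quad \text{subject to} \quad
0 \le \alpha_i \le \frac{1}{\nu \ell}, \qquad \sum_{i=1}^{\ell} \alpha_i = 1 .

% Decision function: positive on the estimated region S, negative on its complement
f(x) = \operatorname{sgn}\!\Big( \sum_{i=1}^{\ell} \alpha_i \, k(x_i, x) - \rho \Big)
```

Only training points with $\alpha_i > 0$ contribute to $f$, which corresponds to the "potentially small subset of the training data" in the abstract.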

**TRONECLA.PS** - Other

## More information

Published date: July 2001

Organisations:
Electronics & Computer Science

## Identifiers

Local EPrints ID: 259007

URI: http://eprints.soton.ac.uk/id/eprint/259007

PURE UUID: 3c8f4a3f-0182-40e9-a9d8-64d071cd00ff

## Catalogue record

Date deposited: 05 Mar 2004

Last modified: 16 Dec 2019 20:45

## Contributors

Author:
B. Schölkopf

Author:
J.C. Platt

Author:
J.S. Shawe-Taylor

Author:
A.J. Smola

Author:
R.C. Williamson
