The University of Southampton
University of Southampton Institutional Repository

Text Categorization via Ellipsoid Separation

Kharechko, Andriy, Shawe-Taylor, John, Herbrich, Ralf and Graepel, Thore (2004) Text Categorization via Ellipsoid Separation. Learning Methods for Text Understanding and Mining, Grenoble, France. 26 - 29 Jan 2004.

Record type: Conference or Workshop Item (Other)

Abstract

We present a new batch learning algorithm for text classification in the vector space of document representations. The algorithm uses ellipsoid separation in the feature space, which leads to a semidefinite program. Features are extracted with Gram-Schmidt orthogonalization, used as an approximation of latent semantic feature extraction. Preliminary results demonstrate some potential for the presented approach.
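
To illustrate the feature extraction step named in the abstract, here is a minimal sketch of greedy Gram-Schmidt orthogonalization applied to bag-of-words vectors. It is not taken from the paper: the function name `gram_schmidt_features`, the pivot rule (choose the document with the largest residual norm), the tolerance, and the toy data are all assumptions made for illustration. The ellipsoid separation and the semidefinite program are not shown.

```python
import math


def gram_schmidt_features(docs, k, tol=1e-12):
    """Build up to k orthonormal directions from documents and project onto them.

    docs: list of equal-length term-count vectors (bag-of-words rows).
    Returns (basis, features), where features[i] holds the coordinates of
    docs[i] in the extracted basis.
    Assumption: pivots are chosen greedily by largest residual norm.
    """
    residuals = [[float(v) for v in d] for d in docs]
    basis = []
    for _ in range(k):
        # Squared norm of what is left of each document after earlier steps.
        norms = [sum(v * v for v in r) for r in residuals]
        pivot = max(range(len(residuals)), key=norms.__getitem__)
        if norms[pivot] <= tol:
            break  # every document already lies in the span of the basis
        scale = math.sqrt(norms[pivot])
        q = [v / scale for v in residuals[pivot]]
        basis.append(q)
        # Remove the new direction from every residual (the Gram-Schmidt step).
        for r in residuals:
            coeff = sum(a * b for a, b in zip(r, q))
            for j in range(len(r)):
                r[j] -= coeff * q[j]
    features = [[sum(a * b for a, b in zip(d, q)) for q in basis] for d in docs]
    return basis, features


# Hypothetical toy corpus: 4 documents over a 4-term vocabulary.
docs = [[2, 0, 1, 0], [0, 3, 0, 1], [2, 3, 1, 1], [1, 0, 0, 0]]
basis, features = gram_schmidt_features(docs, k=2)
```

Each row of `features` is a low-dimensional representation of one document, which a downstream classifier could take as input.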

More information

Published date: 2004
Additional Information: Event Dates: 26 - 29 January 2004
Venue - Dates: Learning Methods for Text Understanding and Mining, Grenoble, France, 2004-01-26 - 2004-01-29
Keywords: Text categorization, pattern separation, semidefinite programming, ellipsoid, latent semantic indexing, feature extraction, bag-of-words text representation, Gram-Schmidt orthogonalization.
Organisations: Electronics & Computer Science

Identifiers

Local EPrints ID: 263412
URI: http://eprints.soton.ac.uk/id/eprint/263412
PURE UUID: 7c36eaa9-8f88-4a0b-a0f3-32aa2d548c29

Catalogue record

Date deposited: 12 Feb 2007
Last modified: 14 Mar 2024 07:31

Contributors

Author: Andriy Kharechko
Author: John Shawe-Taylor
Author: Ralf Herbrich
Author: Thore Graepel
