Text Categorization via Ellipsoid Separation
We present a new batch learning algorithm for text classification in the vector space of document representations. The algorithm uses ellipsoid separation in the feature space, which leads to a semidefinite program. Feature extraction uses an approximation of the latent semantic approach based on Gram-Schmidt orthogonalization. Preliminary results demonstrate some potential for the presented approach.
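The abstract names Gram-Schmidt orthogonalization as the feature-extraction step but gives no details. As a rough illustration only, the sketch below shows one common way to obtain low-dimensional features this way: repeatedly pick the document whose residual is largest, normalise it into a new basis direction, and project that direction out of every document. The function name `gram_schmidt_features`, the largest-residual pivot rule, and the parameters `k` and `tol` are assumptions made for this example and are not taken from the paper.

```python
import math


def gram_schmidt_features(docs, k, tol=1e-10):
    """Greedy partial Gram-Schmidt over bag-of-words vectors (illustrative).

    docs: list of equal-length lists of floats, one per document.
    k:    maximum number of basis directions to extract (assumed parameter).
    Returns (basis, features), where basis holds up to k orthonormal
    directions and features[i][j] is the projection of docs[i] on basis[j].
    """
    # Work on copies so the caller's vectors are left untouched.
    residuals = [list(d) for d in docs]
    basis = []
    for _ in range(k):
        norms = [math.sqrt(sum(x * x for x in r)) for r in residuals]
        # Pivot rule (an assumption): the document with the largest residual.
        pivot = max(range(len(residuals)), key=norms.__getitem__)
        if norms[pivot] <= tol:
            break  # every document is already spanned by the basis
        q = [x / norms[pivot] for x in residuals[pivot]]
        basis.append(q)
        # Remove the new direction from every residual.
        for r in residuals:
            coeff = sum(a * b for a, b in zip(r, q))
            for j in range(len(r)):
                r[j] -= coeff * q[j]
    # Features are the coordinates of the original documents in the basis.
    features = [[sum(a * b for a, b in zip(d, q)) for q in basis] for d in docs]
    return basis, features
```

Calling `gram_schmidt_features(docs, k)` on a list of term-count vectors returns at most `k` features per document. Whether the paper selects directions this way, or works in a kernel-defined space instead, is not stated in this record.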
Keywords: Text categorization, pattern separation, semidefinite programming, ellipsoid, latent semantic indexing, feature extraction, bag-of-words text representation, Gram-Schmidt orthogonalization.
Kharechko, Andriy
9dccd719-b3fd-4ff6-9b85-b329e31cba9e
Shawe-Taylor, John
b1931d97-fdd0-4bc1-89bc-ec01648e928b
Herbrich, Ralf
3024ba7e-f3a1-4187-8655-b7f163c7c733
Graepel, Thore
f01fa538-c0f8-4e36-bbcc-698366e73f39
2004
Kharechko, Andriy, Shawe-Taylor, John, Herbrich, Ralf and Graepel, Thore (2004) Text Categorization via Ellipsoid Separation. Learning Methods for Text Understanding and Mining, Grenoble, France. 26 - 29 Jan 2004.
Record type: Conference or Workshop Item (Other)
More information
Published date: 2004
Additional Information:
Event Dates: 26 - 29 January 2004
Venue - Dates:
Learning Methods for Text Understanding and Mining, Grenoble, France, 2004-01-26 - 2004-01-29
Organisations:
Electronics & Computer Science
Identifiers
Local EPrints ID: 263412
URI: http://eprints.soton.ac.uk/id/eprint/263412
PURE UUID: 7c36eaa9-8f88-4a0b-a0f3-32aa2d548c29
Catalogue record
Date deposited: 12 Feb 2007
Last modified: 14 Mar 2024 07:31
Contributors
Author: Andriy Kharechko
Author: John Shawe-Taylor
Author: Ralf Herbrich
Author: Thore Graepel