Text Categorization via Ellipsoid Separation


Kharechko, Andriy, Shawe-Taylor, John, Herbrich, Ralf and Graepel, Thore (2004) Text Categorization via Ellipsoid Separation. At Postgraduate Research Conference in Electronics, Photonics, Communications & Networks, and Computing Science (PREP2004), University of Hertfordshire, Hatfield, UK, 05 - 07 Apr 2004. EPSRC, 19-20.

Description/Abstract

Text categorization, the classification of documents by their semantic content, arises when the documents in a collection have to be ranked according to their relevance to a set of topics that is usually predefined (e.g. web search, or classifying news articles by whether they deal with business topics). In this work we present a new batch learning algorithm for text classification. Our method applies non-linear ellipsoid separation to the vector space representation of text documents. We use the 'bag of words' vector representation of text documents, the maximal separation ratio method for pattern separation via ellipsoids [2], and the kernel GSK algorithm [1] for feature extraction. The method thus combines maximization of the separation ratio with an approximation of latent semantic feature extraction. We present some preliminary results which indicate high potential for this approach.
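
As a rough illustration of the two ingredients named in the abstract, the sketch below builds 'bag of words' count vectors and tests whether a vector falls inside an ellipsoid {x : (x - c)^T A (x - c) <= 1}. It is not the authors' algorithm: the vocabulary, the center c and the shape matrix A are hypothetical, hand-picked values, whereas the method described above would obtain the ellipsoid by maximizing the separation ratio and would first apply kernel GSK feature extraction. The function names are assumptions made for this example only.

```python
from collections import Counter


def bag_of_words(text, vocabulary):
    """Map a document to a term-count vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocabulary]


def ellipsoid_score(x, center, shape):
    """Return (x - c)^T A (x - c), with the matrix A given as a list of rows."""
    d = [xi - ci for xi, ci in zip(x, center)]
    n = len(d)
    return sum(d[i] * shape[i][j] * d[j] for i in range(n) for j in range(n))


def classify(x, center, shape):
    """Label a vector 'inside' if its score is at most 1, else 'outside'."""
    return "inside" if ellipsoid_score(x, center, shape) <= 1.0 else "outside"


# Hypothetical toy setup: two "business" terms and two "sport" terms.
vocabulary = ["market", "shares", "match", "goal"]

# Hand-picked ellipsoid around a typical business document (an assumption,
# not a learned solution): diagonal shape matrix, tighter on sport terms.
center = [1.0, 1.0, 0.0, 0.0]
shape = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 0.0],
    [0.0, 0.0, 0.0, 4.0],
]

business = bag_of_words("Shares rise as market rallies", vocabulary)
sport = bag_of_words("Late goal wins the match", vocabulary)

print(classify(business, center, shape))
print(classify(sport, center, shape))
```

In this toy setup the business headline coincides with the center and so lies inside the ellipsoid, while the sport headline lies well outside it. A learned ellipsoid would play the same role, with its center and shape chosen from training data rather than by hand.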

Item Type: Conference or Workshop Item (Poster)
Additional Information: Event Dates: 5 - 7 April 2004
Keywords: Text categorization, ellipsoid separation, semidefinite programming, latent semantic indexing, feature extraction.
Divisions: Faculty of Physical and Applied Science > Electronics and Computer Science
Item ID: 263417
Date Deposited: 13 Feb 2007
Last Modified: 02 Mar 2012 12:59
Contributors: Kharechko, Andriy (Author)
Shawe-Taylor, John (Author)
Herbrich, Ralf (Author)
Graepel, Thore (Author)
Date: 2004
Status: Published
Publisher: EPSRC
URI: http://eprints.soton.ac.uk/id/eprint/263417
