The University of Southampton
University of Southampton Institutional Repository

Extracting latent structures in numerical classification: An investigation using two factor models


Choudhury, Arindum, Ong, YewSoon and Keane, Andy J. (2002) Extracting latent structures in numerical classification: An investigation using two factor models. 9th International Conference on Neural Information Processing. ICONIP'02. 18-22 Nov 2002. pp. 1842-1846.

Record type: Conference or Workshop Item (Paper)

Abstract

We investigate the use of SVD-based two-factor models for numerical data classification. Motivations for such a study include the widespread success of such models (e.g., LSI) in textual information retrieval, emerging connections with well-established statistical techniques, and the increasing occurrence of mixed-mode (text-and-numeric) data.
A direct extension and an efficient modification of the LSI model, both applied to numerical data problems, are presented, and the associated problems and likely remedies are discussed. The techniques under investigation are shown to perform competitively with popular existing numerical classification techniques on a range of synthetic and real-world benchmark data. In particular, we show that the modified LSI proposed in this work avoids confronting the optimal subspace selection problem, yet generalizes well and remains computationally efficient for large data.

Text
chou_02a.pdf - Accepted Manuscript

More information

Published date: 2002
Venue - Dates: 9th International Conference on Neural Information Processing. ICONIP'02, 2002-11-18 - 2002-11-22

Identifiers

Local EPrints ID: 22259
URI: http://eprints.soton.ac.uk/id/eprint/22259
PURE UUID: dd049725-ffa0-4824-a5fd-9bb5003ae1f7

Catalogue record

Date deposited: 10 Jul 2006
Last modified: 19 Jul 2019 19:11
