An application of the nearest correlation matrix to Web document classification


Qi, Hou-Duo, Xia, Zhonghang and Xing, Guangming (2007) An application of the nearest correlation matrix to Web document classification. Journal of Industrial Management and Optimization, 3 (4), 701-713.

Full text not available from this repository.

Description/Abstract

A Web document organizes textual data according to a predefined logical structure, and
grouping Web documents with similar structures has been shown to improve query efficiency.
However, an XML document has no vectorial representation, which most existing
classification algorithms require. Kernel methods address this by representing structured
data through pairwise similarities, so that a set of Web documents can be fed into a
classification algorithm in the form of a kernel matrix. Because the distance between a
pair of Web documents is usually obtained only approximately, the derived distance matrix
is in general not a valid kernel matrix. In this paper, we propose to use the nearest
correlation matrix (of the estimated distance matrix) as the kernel matrix, which can be
computed quickly by a Newton-type method. Experimental studies show that the classification
accuracy can be significantly improved.
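
To make the idea of a nearest correlation matrix concrete, here is a minimal sketch in Python with NumPy. It does not reproduce the Newton-type method of the paper, which this record does not describe. Instead it swaps in a simpler and slower technique, alternating projections with Dykstra's correction, which is a well-known way to approximate the same matrix. The function names `project_psd` and `nearest_correlation`, the tolerance values, and the 3x3 input `s` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def project_psd(a):
    """Project a symmetric matrix onto the positive semidefinite cone."""
    w, v = np.linalg.eigh(a)
    # Clip negative eigenvalues to zero and rebuild the matrix.
    return (v * np.maximum(w, 0.0)) @ v.T


def nearest_correlation(g, tol=1e-8, max_iter=1000):
    """Approximate the nearest correlation matrix to g in Frobenius norm.

    Illustrative only: alternating projections with Dykstra's correction,
    not the Newton-type method referred to in the abstract.
    """
    y = (g + g.T) / 2.0          # work with the symmetric part of the input
    ds = np.zeros_like(y)        # Dykstra correction term
    for _ in range(max_iter):
        r = y - ds
        x = project_psd(r)       # projection onto the PSD cone
        ds = x - r
        y_new = (x + x.T) / 2.0
        np.fill_diagonal(y_new, 1.0)   # projection onto unit-diagonal matrices
        step = np.linalg.norm(y_new - y, "fro")
        y = y_new
        if step <= tol * max(1.0, np.linalg.norm(y, "fro")):
            break
    return y


# Hypothetical similarity estimates: symmetric with unit diagonal,
# but indefinite, so not usable as a kernel matrix as given.
s = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])
k = nearest_correlation(s)

print("smallest eigenvalue before:", np.linalg.eigvalsh(s).min())
print("smallest eigenvalue after: ", np.linalg.eigvalsh(k).min())
```

The result `k` has unit diagonal and is positive semidefinite up to a small numerical tolerance, so it could in principle be passed to a classifier that accepts a precomputed kernel matrix. For the large matrices that arise from real document collections, a faster solver such as the Newton-type method mentioned in the abstract would be the more natural choice.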

Item Type: Article
ISSNs: 1547-5816 (print)
Keywords: support vector machines, classification, kernel matrix, semidefinite programming.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: University Structure - Pre August 2011 > School of Mathematics > Operational Research
ePrint ID: 54536
Date Deposited: 28 Jul 2008
Last Modified: 27 Mar 2014 18:37
URI: http://eprints.soton.ac.uk/id/eprint/54536
