Using KCCA for Japanese-English cross-language information retrieval and classification
Li, Yaoyong and Shawe-Taylor, John
(2005)
Using KCCA for Japanese-English cross-language information retrieval and classification.
Journal of Intelligent Information Systems, tba (tba).
Abstract
Kernel Canonical Correlation Analysis (KCCA) is a method for finding linear relationships between two variables in a kernel-defined feature space. We study a machine learning algorithm based on KCCA for cross-language information retrieval and apply it to Japanese-English cross-language information retrieval. The results are encouraging and significantly better than those obtained by other state-of-the-art methods. Computational complexity is an important issue when applying KCCA to large datasets, as in information retrieval, and we experimentally evaluate several methods to alleviate this problem. We also investigate cross-language document classification using KCCA as well as other methods. Our results show that it is feasible to use a classifier learned in one language to classify documents in other languages.
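To make the idea in the abstract concrete, the following is a minimal sketch of regularised KCCA used for cross-language retrieval. It is not the authors' implementation: the function names `fit_kcca` and `rank_documents`, the regularisation parameter `kappa`, the use of uncentred linear kernels, and the synthetic "English" and "Japanese" term matrices are all assumptions made for illustration. The sketch solves the regularised problem through a singular value decomposition and ranks target-language documents by cosine similarity in the learned semantic space.

```python
import numpy as np


def fit_kcca(Kx, Ky, kappa=1.0, n_components=2):
    """Regularised KCCA on two n x n kernel matrices over paired documents.

    Maximises a^T Kx Ky b subject to |(Kx + kappa I) a| = |(Ky + kappa I) b| = 1.
    Returns dual weights (alpha, beta) and the regularised correlations.
    """
    n = Kx.shape[0]
    Rx = Kx + kappa * np.eye(n)
    Ry = Ky + kappa * np.eye(n)
    # Substituting u = Rx a and v = Ry b turns the problem into an SVD of
    # Rx^-1 Kx Ky Ry^-1 (Ky and Ry commute, so Ry^-1 Ky = Ky Ry^-1).
    P = np.linalg.solve(Rx, Kx)
    Q = np.linalg.solve(Ry, Ky)
    U, s, Vt = np.linalg.svd(P @ Q)
    alpha = np.linalg.solve(Rx, U[:, :n_components])
    beta = np.linalg.solve(Ry, Vt[:n_components].T)
    return alpha, beta, s[:n_components]


def rank_documents(query_k, alpha, docs_K, beta):
    """Rank target-language documents for one source-language query.

    query_k: kernel values between the query and the source training docs, shape (n,)
    docs_K:  kernel values between candidate docs and the target training docs, shape (m, n)
    """
    q = query_k @ alpha                      # query in the shared space
    D = docs_K @ beta                        # candidates in the shared space
    sims = (D @ q) / (np.linalg.norm(D, axis=1) * np.linalg.norm(q) + 1e-12)
    return np.argsort(-sims)                 # best match first


# Synthetic paired corpus (hypothetical data): both "languages" share latent topics.
rng = np.random.default_rng(0)
n, topics, vocab_en, vocab_ja = 30, 4, 20, 25
Z = rng.random((n, topics))
X_en = Z @ rng.random((topics, vocab_en)) + 0.01 * rng.random((n, vocab_en))
X_ja = Z @ rng.random((topics, vocab_ja)) + 0.01 * rng.random((n, vocab_ja))

# Linear kernels over the term vectors, left uncentred for brevity.
K_en, K_ja = X_en @ X_en.T, X_ja @ X_ja.T

alpha, beta, corr = fit_kcca(K_en, K_ja, kappa=0.1, n_components=4)
ranking = rank_documents(K_en[0], alpha, K_ja, beta)
print("regularised correlations:", corr)
print("top-ranked Japanese documents for English document 0:", ranking[:5])
```

The SVD step costs on the order of n^3 operations for n training pairs, which illustrates why the abstract treats computational complexity as an issue for large collections.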
Text
kcca2005.pdf - Other
More information
Published date: 2005
Keywords:
KCCA, cross-language information retrieval, algorithm, Japanese, English, kernel
Organisations:
Electronics & Computer Science
Identifiers
Local EPrints ID: 260786
URI: http://eprints.soton.ac.uk/id/eprint/260786
PURE UUID: 99aeedd9-0817-4d8d-b1b5-85fccd345504
Catalogue record
Date deposited: 21 Apr 2005
Last modified: 14 Mar 2024 06:43
Contributors
Author: Yaoyong Li
Author: John Shawe-Taylor