Discovering cross-language links in Wikipedia through semantic relatedness
Penta, Antonio
Quercini, Gianluca
Reynaud, Chantal
Shadbolt, Nigel
2012
Penta, Antonio, Quercini, Gianluca, Reynaud, Chantal and Shadbolt, Nigel
(2012)
Discovering cross-language links in Wikipedia through semantic relatedness.
20th European Conference on Artificial Intelligence (ECAI 2012), Montpellier, France.
26 - 30 Aug 2012.
Record type:
Conference or Workshop Item
(Paper)
Abstract
Wikipedia is a large multilingual collection of interlinked articles, used and contributed to by millions of users over the Internet, that provides editions in up to 283 languages. Two articles in different language versions of Wikipedia may have information on exactly the same concept, in which case they are often connected through a cross-language link. However, many cross-language links are either missing or incorrect, and this negatively affects both the readers of Wikipedia and multilingual information retrieval applications. In this paper, we propose WikiCL, an algorithm for discovering cross-language links using the semantic relatedness of two articles derived from the Wikipedia graph structure. Our evaluation shows that we achieve comparable and, in some cases, better results than previous methods in much less computation time.
This record has no associated files available for download.
More information
Published date: 2012
Venue - Dates:
20th European Conference on Artificial Intelligence (ECAI 2012), Montpellier, France, 2012-08-26 - 2012-08-30
Organisations:
Electronics & Computer Science
Identifiers
Local EPrints ID: 340145
URI: http://eprints.soton.ac.uk/id/eprint/340145
PURE UUID: 2f8c50ea-a5a9-4758-be71-cbc7fa40271c
Catalogue record
Date deposited: 13 Jun 2012 13:16
Last modified: 08 Jan 2022 00:09
Contributors
Author:
Antonio Penta
Author:
Gianluca Quercini
Author:
Chantal Reynaud
Author:
Nigel Shadbolt