The University of Southampton
University of Southampton Institutional Repository

Distributed human computation framework for linked data co-reference resolution


Yang, Yang, Singh, Priyanka, Yao, Jiadi, Au Yeung, Ching Man, Zareian, Amir, Wang, Xiaowei, Cai, Zhonglun, Salvadores, Manuel, Gibbins, Nicholas, Hall, Wendy and Shadbolt, Nigel (2011) Distributed human computation framework for linked data co-reference resolution. 8th Extended Semantic Web Conference, Lecture Notes in Computer Science (LNCS), Volume 6643, Heraklion, Greece. 29 May - 02 Jun 2011. pp. 32-46.

Record type: Conference or Workshop Item (Paper)

Abstract

Distributed Human Computation (DHC) is a technique for solving computational problems by incorporating the collaborative effort of a large number of humans. It is also a way to address AI-complete problems such as natural language processing. The Semantic Web, with its roots in AI, is envisioned as a decentralised, world-wide information space for sharing machine-readable data with minimal integration costs. Many research problems in the Semantic Web are considered AI-complete. One example is co-reference resolution, which involves determining whether different URIs refer to the same entity. This is considered a significant hurdle to overcome in realising large-scale Semantic Web applications. In this paper, we propose a framework for building a DHC system on top of the Linked Data Cloud to solve various computational problems. To demonstrate the concept, we focus on co-reference resolution in the Semantic Web when integrating distributed datasets. The traditional way to solve this problem is to design machine-learning algorithms. However, such algorithms are often computationally expensive and error-prone, and they do not scale. We designed a DHC system named iamResearcher, which solves the author identity co-reference problem for scientific publications when integrating distributed bibliographic datasets. In our system, we aggregated 6 million bibliographic records from various publication repositories. Users can sign up to the system to audit and align their own publications, thus solving the co-reference problem in a distributed manner. The aggregated results are published to the Linked Data Cloud.
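
The idea in the abstract, that users confirm which author URIs refer to them and the aggregated result is published as Linked Data, can be illustrated with a minimal sketch. This is not the iamResearcher implementation: the class `CoreferenceStore`, its methods, and the `example.org` URIs are hypothetical names introduced here, and the use of a union-find structure and `owl:sameAs` N-Triples output is an assumption about one plausible way to record such claims.

```python
from collections import defaultdict

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


class CoreferenceStore:
    """Hypothetical store that groups URIs confirmed to denote one entity."""

    def __init__(self):
        self._parent = {}  # union-find parent pointers, keyed by URI

    def _find(self, uri):
        # Register unseen URIs as their own root, then walk up to the root.
        self._parent.setdefault(uri, uri)
        root = uri
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node directly at the root.
        while self._parent[uri] != root:
            self._parent[uri], uri = root, self._parent[uri]
        return root

    def claim(self, user_uri, author_uri):
        """Record a user's confirmation that author_uri refers to them."""
        a, b = self._find(user_uri), self._find(author_uri)
        if a != b:
            self._parent[b] = a

    def same_entity(self, uri_a, uri_b):
        return self._find(uri_a) == self._find(uri_b)

    def bundles(self):
        """Return each group of two or more co-referent URIs, sorted."""
        groups = defaultdict(set)
        for uri in list(self._parent):
            groups[self._find(uri)].add(uri)
        return sorted(sorted(g) for g in groups.values() if len(g) > 1)

    def to_ntriples(self):
        """Serialise each bundle as owl:sameAs links from its first URI."""
        lines = []
        for bundle in self.bundles():
            canonical = bundle[0]
            for other in bundle[1:]:
                lines.append(f"<{canonical}> <{OWL_SAME_AS}> <{other}> .")
        return "\n".join(lines)


# Illustrative usage with made-up URIs from two repositories.
store = CoreferenceStore()
user = "http://example.org/user/1"
store.claim(user, "http://example.org/repoA/author/17")
store.claim(user, "http://example.org/repoB/author/y-yang")
print(store.to_ntriples())
```

In this sketch each user contributes only the claims about their own publications, so the merging work is spread across users rather than computed by a similarity algorithm; conflict handling and auditing of wrong claims are left out.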

Text: paper_10.pdf - Version of Record (908kB)
Slideshow: ESWC.pptx - Other (49MB)

More information

Published date: 29 May 2011
Additional Information: Event Dates: 29th May - 2nd June 2011
Venue - Dates: 8th Extended Semantic Web Conference, Lecture Notes in Computer Science (LNCS), Volume 6643, Heraklion, Greece, 2011-05-29 - 2011-06-02
Keywords: Linked Data, DHC, Crowd-sourcing, Co-reference
Organisations: Web & Internet Science

Identifiers

Local EPrints ID: 272060
URI: https://eprints.soton.ac.uk/id/eprint/272060
PURE UUID: fdcbbc7e-dd38-4db9-bf05-e0715302afb8
ORCID for Nicholas Gibbins: orcid.org/0000-0002-6140-9956
ORCID for Wendy Hall: orcid.org/0000-0003-4327-7811

Catalogue record

Date deposited: 23 Feb 2011 23:06
Last modified: 24 Sep 2019 01:00
