The University of Southampton
University of Southampton Institutional Repository

When humans and machines collaborate: cross-lingual label editing in Wikidata


Kaffee, Lucie-Aimée, Endris, Kemele M. and Simperl, Elena (2019) When humans and machines collaborate: cross-lingual label editing in Wikidata. In OpenSym '19 Proceedings of the 15th International Symposium on Open Collaboration. Association for Computing Machinery. 9 pp. (doi:10.1145/3306446.3340826).

Record type: Conference or Workshop Item (Paper)

Abstract

The quality and maintainability of a knowledge graph are determined by the process in which it is created. There are different approaches to such processes: extraction or conversion of available data on the web (automated extraction of knowledge, such as DBpedia from Wikipedia); community-created knowledge graphs, often built by a group of experts; and hybrid approaches where humans maintain the knowledge graph alongside bots. In this work, we focus on the hybrid approach of human-edited knowledge graphs supported by automated tools. In particular, we analyse the editing of natural language data, i.e. labels. Labels are the entry point for humans to understand the information and therefore need to be carefully maintained. We take a step toward understanding the collaborative editing of humans and automated tools across languages in a knowledge graph. We use Wikidata as it has a large and active community of humans and bots working together, covering over 300 languages. We analyse the different editor groups and how they interact with the different language data to understand the provenance of the current label data.

Text
Cross-lingual Label Editing in Wikidata - Version of Record
Available under License Creative Commons Attribution.

More information

Published date: 20 August 2019
Venue - Dates: 15th International Symposium on Open Collaboration, Skövde, Sweden, 2019-08-20 - 2019-08-22
Keywords: Wikidata, Multilinguality, Community

Identifiers

Local EPrints ID: 433768
URI: http://eprints.soton.ac.uk/id/eprint/433768
PURE UUID: 4cec75d2-6e9d-4997-82e9-d28de593b6ce
ORCID for Lucie-Aimée Kaffee: orcid.org/0000-0002-1514-8505
ORCID for Elena Simperl: orcid.org/0000-0003-1722-947X

Catalogue record

Date deposited: 03 Sep 2019 16:30
Last modified: 16 Mar 2024 03:51

Contributors

Author: Lucie-Aimée Kaffee
Author: Kemele M. Endris
Author: Elena Simperl
