The University of Southampton
University of Southampton Institutional Repository

NILK: entity linking dataset targeting NIL-linking cases

Iurshina, Anastasiia
Boutalbi, Rafika
Pan, Jiaxin
Staab, Steffen

Iurshina, Anastasiia, Boutalbi, Rafika, Pan, Jiaxin and Staab, Steffen (2022) NILK: entity linking dataset targeting NIL-linking cases. International Conference on Information and Knowledge Management, Atlanta, United States. 17 - 21 Oct 2022. 5 pp. (In Press)

Record type: Conference or Workshop Item (Paper)

Abstract

The NIL-linking task in Entity Linking deals with cases where mentions in the text have no corresponding entity in the associated knowledge base. NIL-linking has two sub-tasks: NIL-detection and NIL-disambiguation. NIL-detection identifies NIL-mentions in the text; NIL-disambiguation then determines whether some NIL-mentions refer to the same out-of-knowledge-base entity. Although multiple existing datasets can be adapted for NIL-detection, none of them addresses NIL-disambiguation. This paper presents NILK, a new dataset for NIL-linking, constructed from WikiData and Wikipedia dumps taken at two different timestamps. The NILK dataset has two main features: 1) it marks NIL-mentions for NIL-detection by extracting, from Wikipedia text, mentions that belong to newly added entities; 2) it provides an entity label for NIL-disambiguation by marking NIL-mentions with WikiData IDs from the newer dump. We make the annotated dataset available along with the code. The NILK dataset is available at: https://zenodo.org/record/6607514.
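
The two features described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the record layout, and the entity IDs below are hypothetical placeholders, and the real pipeline works on full WikiData and Wikipedia dumps. The sketch only assumes that each mention from the newer dump carries an entity ID, and that the set of IDs present in the older dump is known.

```python
from collections import defaultdict


def mark_nil_mentions(mentions, old_ids):
    """Label (mention_text, entity_id) pairs taken from the newer dump.

    A mention is flagged as NIL when its entity ID is absent from the
    older dump, i.e. the entity was added between the two timestamps.
    """
    return [
        {"mention": text, "entity_id": entity_id, "nil": entity_id not in old_ids}
        for text, entity_id in mentions
    ]


def group_nil_mentions(labelled):
    """Group NIL-mentions by their newer-dump ID (the disambiguation label)."""
    groups = defaultdict(list)
    for item in labelled:
        if item["nil"]:
            groups[item["entity_id"]].append(item["mention"])
    return dict(groups)


# Hypothetical toy data: IDs are placeholders, not real WikiData IDs.
old_ids = {"Q_A", "Q_B"}
mentions = [
    ("Alpha", "Q_A"),        # entity already known in the older dump
    ("Gamma Corp", "Q_C"),   # entity only present in the newer dump
    ("Gamma", "Q_C"),        # second mention of the same new entity
    ("Delta", "Q_D"),        # a different new entity
]

labelled = mark_nil_mentions(mentions, old_ids)
groups = group_nil_mentions(labelled)
```

In this toy setting the `nil` flag corresponds to the NIL-detection label, and mentions that share a key in `groups` correspond to NIL-mentions referring to the same out-of-knowledge-base entity.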

Text
sp0935-iurshina - Accepted Manuscript
Restricted to Repository staff only

More information

Accepted/In Press date: 4 August 2022
Venue - Dates: International Conference on Information and Knowledge Management, Atlanta, United States, 2022-10-17 - 2022-10-21

Identifiers

Local EPrints ID: 470716
URI: http://eprints.soton.ac.uk/id/eprint/470716
PURE UUID: a8b87ca0-d6cf-41ab-a27e-5d4fded184d9
ORCID for Steffen Staab: orcid.org/0000-0002-0780-4154

Catalogue record

Date deposited: 18 Oct 2022 17:06
Last modified: 17 Mar 2024 03:38

Contributors

Author: Anastasiia Iurshina
Author: Rafika Boutalbi
Author: Jiaxin Pan
Author: Steffen Staab
