The University of Southampton
University of Southampton Institutional Repository

Making sense of numerical data - semantic labelling of web tables

Volume: 11313
Publisher: Springer

Kacprzak, Emilia Magdalena, Gimenez-Garcia, Jose M., Piscopo, Alessandro, Koesten, Laura Mylena, Ibanez Gonzalez, Luis, Tennison, Jeni and Simperl, Elena (2018) Making sense of numerical data - semantic labelling of web tables. In Knowledge Engineering and Knowledge Management. Springer. 15 pp. (doi:10.1007/978-3-030-03667-6_11).

Record type: Conference or Workshop Item (Paper)

Abstract

With the increasing amount of structured data on the web, the need to understand and support search over this emerging data space is growing. Adding semantics to structured data can help address existing challenges in data discovery, as it facilitates understanding the values in their context. While there are approaches for lifting structured data to semantic web formats to enrich it and facilitate discovery, most work to date focuses on textual fields rather than numerical data. In this paper, we propose a two-level (row- and column-based) approach, called NUMER, to add semantic meaning to numerical values in tables. We evaluate our approach using a benchmark (NumDB) generated for the purpose of this work. We show the influence of the different levels of analysis on the success of assigning semantic labels to numerical values in tables. Our approach outperforms the state of the art and is less affected by data structure and quality issues such as a small number of entities or deviations in the data.

Text
numerical-kacprzak - Accepted Manuscript
Download (387kB)

More information

Accepted/In Press date: 13 October 2018
e-pub ahead of print date: 31 October 2018
Published date: 30 November 2018
Keywords: Semantic Labelling, Numerical Values, Linked Data

Identifiers

Local EPrints ID: 426723
URI: http://eprints.soton.ac.uk/id/eprint/426723
ISSN: 1611-3349
PURE UUID: fe4b03aa-867f-43ce-bd85-8a5e9a11f4d8
ORCID for Alessandro Piscopo: orcid.org/0000-0002-0362-4826
ORCID for Luis Ibanez Gonzalez: orcid.org/0000-0001-6993-0001
ORCID for Elena Simperl: orcid.org/0000-0003-1722-947X

Catalogue record

Date deposited: 11 Dec 2018 17:30
Last modified: 16 Mar 2024 07:24

Contributors

Author: Emilia Magdalena Kacprzak
Author: Jose M. Gimenez-Garcia
Author: Alessandro Piscopo
Author: Laura Mylena Koesten
Author: Luis Ibanez Gonzalez
Author: Jeni Tennison
Author: Elena Simperl
