The University of Southampton
University of Southampton Institutional Repository

Data quality assessment from provenance graphs


Huynh, Trung Dong, Ebden, Mark, Ramchurn, Sarvapali, Roberts, Stephen and Moreau, Luc (2014) Data quality assessment from provenance graphs. Provenance Analytics 2014, Germany. 09 Jun 2014. 4 pp.

Record type: Conference or Workshop Item (Paper)

Abstract

Provenance is a domain-independent means to represent what happened in an application, which can help verify data and infer data quality. Provenance patterns can manifest real-world phenomena such as a significant interest in a piece of content, providing an indication of its quality, or even issues such as undesirable interactions within a group of contributors. This paper presents an application-independent methodology for analyzing data based on the network metrics of provenance graphs to learn about such patterns and to relate them to data quality in an automated manner. Validating this method on the provenance records of CollabMap, an online crowdsourcing mapping application, we demonstrated an accuracy level of over 95% for the trust classification of data generated by the crowd therein.
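
As a rough illustration of what "network metrics of provenance graphs" could look like in practice, the sketch below computes a few generic graph metrics (node count, edge count, diameter, maximum degree) for a toy provenance graph. The abstract does not say which metrics or which classifier the paper uses, so the metric selection, the `network_metrics` helper, and the edge names are all assumptions made for illustration. In a pipeline of the kind the abstract describes, such values would presumably serve as features for a trust classifier.

```python
from collections import deque


def network_metrics(edges):
    """Return simple network metrics for a graph given as (source, target) pairs.

    Hypothetical helper: the metric set is an assumption, not the paper's.
    """
    # Build an undirected adjacency map; edge direction is ignored for distances.
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)

    def eccentricity(start):
        # Breadth-first search: longest shortest path from `start`
        # to any node reachable from it.
        distance = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        return max(distance.values())

    return {
        "nodes": len(adjacency),
        "edges": len(set(edges)),
        # For a disconnected graph this is the largest diameter of any component.
        "diameter": max(eccentricity(node) for node in adjacency),
        "max_degree": max(len(neighbours) for neighbours in adjacency.values()),
    }


# Toy provenance edges; every identifier here is invented for illustration.
toy_edges = [
    ("building1", "activity1"),  # e.g. an entity generated by an activity
    ("activity1", "user1"),      # e.g. an activity associated with an agent
    ("vote1", "building1"),      # e.g. a vote derived from the entity
    ("vote2", "building1"),
    ("vote1", "user2"),
    ("vote2", "user3"),
]

metrics = network_metrics(toy_edges)
print(metrics)
```

Ignoring edge direction when measuring distances is a simplification chosen here for brevity; a real analysis might treat the directed and undirected cases separately.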

PDF
provanalytics.pdf - Accepted Manuscript

More information

Published date: 9 June 2014
Venue - Dates: Provenance Analytics 2014, Germany, 2014-06-09 - 2014-06-09
Keywords: provenance, analytics, network metrics, machine learning, data quality
Organisations: Web & Internet Science, Agents, Interactions & Complexity

Identifiers

Local EPrints ID: 365510
URI: https://eprints.soton.ac.uk/id/eprint/365510
PURE UUID: 17fcdf67-c898-4058-b238-a9a83a23fbd6
ORCID for Trung Dong Huynh: orcid.org/0000-0003-4937-2473
ORCID for Sarvapali Ramchurn: orcid.org/0000-0001-9686-4302
ORCID for Luc Moreau: orcid.org/0000-0002-3494-120X

Catalogue record

Date deposited: 07 Jun 2014 13:45
Last modified: 06 Jun 2018 13:04

Contributors

Author: Trung Dong Huynh
Author: Mark Ebden
Author: Sarvapali Ramchurn
Author: Stephen Roberts
Author: Luc Moreau
