University of Southampton Institutional Repository

Provenance-based reproducibility in the Semantic Web


Moreau, Luc (2011) Provenance-based reproducibility in the Semantic Web. Web Semantics: Science, Services and Agents on the World Wide Web, 9 (2), 202-221. (doi:10.1016/j.websem.2011.03.001).

Record type: Article

Abstract

Reproducibility is a crucial property of data since it allows users to understand and verify how data was derived, and therefore allows them to put their trust in such data. Reproducibility is essential for science, because the reproducibility of experimental results is a tenet of the scientific method, but it is also beneficial in many other fields, including automated decision making, visualization, and automated data feeds. To achieve the vision of reproducibility, the workflow-based community has strongly advocated the use of provenance as an underpinning mechanism, since a rich representation of provenance allows steps to be reproduced and all intermediary and final results to be checked and validated. Concurrently, multiple ontology-based representations of provenance have been devised to describe past computations uniformly across a variety of technologies. However, such Semantic Web representations of provenance do not have any formal link with execution. Even assuming a faithful and non-malicious environment, how can we claim that an ontology-based representation of provenance enables reproducibility, since it has not been given any execution semantics, and therefore has no formal way of expressing the reproduction of computations? This paper tackles this problem by defining a denotational semantics for the Open Provenance Model, which is referred to as the reproducibility semantics. This semantics is used to implement a reproducibility service, leveraging multiple Semantic Web technologies and offering a variety of reproducibility approaches found in the literature. A series of empirical experiments was designed to exhibit the range of reproducibility capabilities of our approach; in particular, we demonstrate the ability to reproduce computations involving multiple technologies, as is commonly found on the Web.

Text: reproducibility.pdf - Author's Original (439kB)
Text: reproducibility.pdf - Accepted Manuscript (499kB)
Other: reprod6.sml - Other (46kB), available under License Other

More information

Submitted date: 16 September 2010
Published date: July 2011
Organisations: Web & Internet Science

Identifiers

Local EPrints ID: 271554
URI: http://eprints.soton.ac.uk/id/eprint/271554
ISSN: 1570-8268
PURE UUID: fc74666c-e9dd-464c-b150-927d5630d8f0
ORCID for Luc Moreau: orcid.org/0000-0002-3494-120X

Catalogue record

Date deposited: 16 Sep 2010 14:40
Last modified: 14 Mar 2024 09:33

Contributors

Author: Luc Moreau
