Capturing interactive data transformation operations using provenance workflows
The ready availability of data is leading to the increased opportunity of their re-use for new applications and for analyses. Most of these data are not necessarily in the format users want, are usually heterogeneous and highly dynamic, and this necessitates data transformation efforts to re-purpose them. Interactive data transformation (IDT) tools are becoming easily available to lower these barriers to data transformation efforts. This paper describes a principled way to capture data lineage of interactive data transformation processes. We provide a formal model of IDT, its mapping to a provenance representation, and its implementation and validation on Google Refine. Provision of the data transformation process sequences allows assessment of data quality and ensures portability between IDT and other data transformation platforms. The proposed model showed a high level of coverage against a set of requirements used for evaluating systems that provide provenance management solutions.
Omitola, Tope
35ba4e4d-beec-4643-a152-995f8979867a
Freitas, Andre
c7a66eef-8f9d-4006-9d6c-cc75e6d6fe19
Curry, Edward
5f6a85f5-e499-42de-8ce8-386d5ee95a1d
O’Riain, Sean
f89998be-8eec-4ab2-8d5f-fd04f1197ee0
Gibbins, Nicholas
98efd447-4aa7-411c-86d1-955a612eceac
Shadbolt, Nigel
5c5acdf4-ad42-49b6-81fe-e9db58c2caf7
May 2012
Omitola, Tope, Freitas, Andre, Curry, Edward, O’Riain, Sean, Gibbins, Nicholas and Shadbolt, Nigel (2012) Capturing interactive data transformation operations using provenance workflows. The Third International Workshop on the Role of Semantic Web in Provenance Management (SWPM 2012), Heraklion, Greece. 27 - 28 May 2012. 12 pp.
Record type: Conference or Workshop Item (Paper)
Text
Omitola_Eswc2012_Provenance_Wkshop.pdf
- Author's Original
More information
Published date: May 2012
Venue - Dates:
The Third International Workshop on the Role of Semantic Web in Provenance Management (SWPM 2012), Heraklion, Greece, 2012-05-27 - 2012-05-28
Organisations:
Electronics & Computer Science
Identifiers
Local EPrints ID: 336970
URI: http://eprints.soton.ac.uk/id/eprint/336970
PURE UUID: b27827e6-671e-4140-99c8-95b118fe1062
Catalogue record
Date deposited: 12 Apr 2012 14:20
Last modified: 15 Mar 2024 03:00
Contributors
Author: Tope Omitola
Author: Andre Freitas
Author: Edward Curry
Author: Sean O’Riain
Author: Nicholas Gibbins
Author: Nigel Shadbolt