University of Southampton Institutional Repository

Adapting integrity enforcement techniques for data reconciliation


Embury, S.M., Brandt, S.M., Robinson, J.S., Sutherland, I., Bisby, F.A., Gray, W.A., Jones, A.C. and White, R.J. (2001) Adapting integrity enforcement techniques for data reconciliation. Information Systems, 26 (8), 657-689. (doi:10.1016/S0306-4379(01)00044-8).

Record type: Article

Abstract

Integration of data sources opens up possibilities for new and valuable applications of data that cannot be supported by the individual sources alone. Unfortunately, many data integration projects are hindered by the inherent heterogeneities in the sources to be integrated. In particular, differences in the way that real-world data is encoded within sources can cause a range of difficulties, not least of which is that the conflicting semantics may not be recognised until the integration project is well under way. Once identified, semantic conflicts of this kind are typically dealt with by configuring a data transformation engine that can convert incoming data into the form required by the integrated system. However, determination of a complete and consistent set of data transformations for any given integration task is far from trivial. In this paper, we explore the potential application of techniques for integrity enforcement in supporting this process. We describe the design of a data reconciliation tool (LITCHI) based on these techniques that aims to assist taxonomists in the integration of biodiversity data sets. Our experiences have highlighted several limitations of integrity enforcement when applied to this real-world problem, and we describe how we have overcome these in the design of our system.


More information

Published date: December 2001
Keywords: data integration, data reconciliation, integrity constraints, integrity enforcement, biodiversity information

Identifiers

Local EPrints ID: 30261
URI: https://eprints.soton.ac.uk/id/eprint/30261
ISSN: 0306-4379
PURE UUID: c368fcf1-75c6-4c47-9949-058d8de9122e

Catalogue record

Date deposited: 11 May 2006
Last modified: 17 Jul 2017 15:55


Contributors

Author: S.M. Embury
Author: S.M. Brandt
Author: J.S. Robinson
Author: I. Sutherland
Author: F.A. Bisby
Author: W.A. Gray
Author: A.C. Jones
Author: R.J. White
