Adapting integrity enforcement techniques for data reconciliation
Integration of data sources opens up possibilities for new and valuable applications of data that cannot be supported by the individual sources alone. Unfortunately, many data integration projects are hindered by the inherent heterogeneities in the sources to be integrated. In particular, differences in the way that real-world data is encoded within sources can cause a range of difficulties, not least of which is that the conflicting semantics may not be recognised until the integration project is well under way. Once identified, semantic conflicts of this kind are typically dealt with by configuring a data transformation engine that can convert incoming data into the form required by the integrated system. However, determination of a complete and consistent set of data transformations for any given integration task is far from trivial. In this paper, we explore the potential application of techniques for integrity enforcement in supporting this process. We describe the design of a data reconciliation tool (LITCHI) based on these techniques that aims to assist taxonomists in the integration of biodiversity data sets. Our experiences have highlighted several limitations of integrity enforcement when applied to this real-world problem, and we describe how we have overcome these in the design of our system.
Keywords: data integration, data reconciliation, integrity constraints, integrity enforcement, biodiversity information
657-689
Embury, S.M.
Brandt, S.M.
Robinson, J.S.
Sutherland, I.
Bisby, F.A.
Gray, W.A.
Jones, A.C.
White, R.J.
December 2001
Embury, S.M., Brandt, S.M., Robinson, J.S., Sutherland, I., Bisby, F.A., Gray, W.A., Jones, A.C. and White, R.J. (2001) Adapting integrity enforcement techniques for data reconciliation. Information Systems, 26 (8), 657-689. (doi:10.1016/S0306-4379(01)00044-8).
More information
Published date: December 2001
Identifiers
Local EPrints ID: 30261
URI: http://eprints.soton.ac.uk/id/eprint/30261
ISSN: 0306-4379
PURE UUID: c368fcf1-75c6-4c47-9949-058d8de9122e
Catalogue record
Date deposited: 11 May 2006
Last modified: 15 Mar 2024 07:39
Contributors
Author: S.M. Embury
Author: S.M. Brandt
Author: J.S. Robinson
Author: I. Sutherland
Author: F.A. Bisby
Author: W.A. Gray
Author: A.C. Jones
Author: R.J. White