The University of Southampton
University of Southampton Institutional Repository

XDTM: The XML Dataset Typing and Mapping for specifying datasets


Moreau, L., Zhao, Y., Foster, I., Voeckler, J. and Wilde, M. (2005) XDTM: The XML Dataset Typing and Mapping for specifying datasets. Sloot, Peter A., Hoekstra, Alfons G., Priol, Thierry, Reinefeld, Alexander and Bubak, Marian (eds.) In EGC'05 Proceedings of the 2005 European conference on Advances in Grid Computing. Springer. pp. 495-505. (doi:10.1007/11508380_51).

Record type: Conference or Workshop Item (Paper)

Abstract

We are concerned with the following problem: How do we allow a community of users to access and process diverse data stored in many different formats? Standard data formats and data access APIs can help but are not general solutions because of their assumption of homogeneity. We propose a new approach based on a separation of concerns between logical and physical structure. We use XML Schema as a type system for expressing the logical structure of datasets and define a separate notion of a mapping that combines declarative and procedural elements to describe physical representations. For example, a collection of environmental data might be mapped variously to a set of files, a relational database, or a spreadsheet but can look the same in all three cases to a user or program that accesses the data via its logical structure. This separation of concerns allows us to specify workflows that operate over complex datasets with, for example, selector constructs being used to select and initiate computations on sets of dataset elements, regardless of whether the sets in question are files in a directory, tables in a database, or columns in a spreadsheet. We present the XDTM design and also the results of application experiments with an XDTM prototype.
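
The separation between logical and physical structure described in the abstract can be illustrated with a minimal Python sketch. It is not the XDTM design or its API: the names `Mapper`, `DirectoryMapper`, `CsvColumnMapper`, and `foreach` are hypothetical, and plain Python types stand in for the XML Schema types the paper uses. The sketch assumes a logical dataset is a set of named elements, each a list of numbers, stored either as text files in a directory or as columns of a CSV file.

```python
import csv
from pathlib import Path
from typing import Callable, Dict, Iterator, List, Protocol, Tuple

Element = Tuple[str, List[float]]  # (element name, numeric values)


class Mapper(Protocol):
    """Hypothetical interface: exposes a physical store as logical elements."""

    def elements(self) -> Iterator[Element]: ...


class DirectoryMapper:
    """Physical form 1: each element is a .txt file of whitespace-separated numbers."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def elements(self) -> Iterator[Element]:
        for path in sorted(self.root.glob("*.txt")):
            yield path.stem, [float(tok) for tok in path.read_text().split()]


class CsvColumnMapper:
    """Physical form 2: each element is a column of a CSV file with a header row."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def elements(self) -> Iterator[Element]:
        with self.path.open(newline="") as handle:
            reader = csv.DictReader(handle)
            names = sorted(reader.fieldnames or [])
            rows = list(reader)
        for name in names:
            yield name, [float(row[name]) for row in rows]


def foreach(mapper: Mapper, compute: Callable[[List[float]], float]) -> Dict[str, float]:
    """Apply one computation to every element, whatever the physical layout."""
    return {name: compute(values) for name, values in mapper.elements()}


def mean(values: List[float]) -> float:
    return sum(values) / len(values)
```

Under these assumptions, `foreach(DirectoryMapper(some_dir), mean)` and `foreach(CsvColumnMapper(some_csv), mean)` give the same result when the two stores hold the same data, because the computation only sees the logical elements.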

Text
egc05.pdf - Accepted Manuscript
Download (125kB)

More information

Published date: February 2005
Organisations: Web & Internet Science

Identifiers

Local EPrints ID: 261143
URI: http://eprints.soton.ac.uk/id/eprint/261143
PURE UUID: eb007a8e-dde3-4d31-920b-5d2f804dcb86
ORCID for L. Moreau: orcid.org/0000-0002-3494-120X

Catalogue record

Date deposited: 10 Aug 2005
Last modified: 15 Mar 2024 21:35

Contributors

Author: L. Moreau
Author: Y. Zhao
Author: I. Foster
Author: J. Voeckler
Author: M. Wilde
Editor: Peter A. Sloot
Editor: Alfons G. Hoekstra
Editor: Thierry Priol
Editor: Alexander Reinefeld
Editor: Marian Bubak
