The University of Southampton
University of Southampton Institutional Repository

An Architecture for Management of Large, Distributed, Scientific Data Using SQL/MED and XML


Papiani, Mark, Wason, Jasmin L. and Nicole, Denis A. (2000) An Architecture for Management of Large, Distributed, Scientific Data Using SQL/MED and XML. In: Zaniolo, Carlo, Lockemann, Peter C., Scholl, Marc H. and Grust, Torsten (eds.) Advances in Database Technology - EDBT 2000. Lecture Notes in Computer Science, 1777, 447-461.

Record type: Article

Abstract

We have developed a Web-based architecture and user interface for archiving and manipulating the results of numerical simulations generated by the UK Turbulence Consortium on the United Kingdom's new national scientific supercomputing resource. These simulations produce large datasets, requiring Web-based mechanisms for the storage, searching and retrieval of simulation results in the hundreds of gigabytes range. We demonstrate that the new DATALINK type, defined in the draft SQL Management of External Data (SQL/MED) standard, which facilitates database management of distributed external data, can help to overcome problems associated with limited bandwidth. We show that a database can meet the apparently divergent requirements of storing both the relatively small simulation result metadata and the large result files in a unified way, whilst maintaining database security, recovery and integrity. By managing data in this distributed way, the system allows post-processing of archived simulation results to be performed directly, without the cost of having to rematerialise them to files. This distribution also reduces access bottlenecks and processor loading. We also show that separating the user interface specification from the user interface processing can provide a number of advantages. We provide a tool that automatically generates a default user interface specification, in the form of an XML document, for a given database. The XML document can then be customised to change the appearance of the interface. Our architecture can archive not only data in a distributed fashion, but also applications. These applications are loosely coupled to the datasets (in a many-to-many relationship) via XML-defined interfaces. They provide reusable server-side post-processing operations such as data reduction and visualisation.
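
The idea of generating a default user interface specification from a database, and then customising it, can be illustrated with a minimal sketch. This is not the authors' tool. It assumes SQLite as a stand-in database, and the element and attribute names (interface, table, column, label, visible), the function name default_ui_spec and the example table are all hypothetical choices made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def default_ui_spec(conn: sqlite3.Connection) -> ET.Element:
    """Build a default UI specification from a database's tables and columns.

    The XML vocabulary used here is hypothetical, not the one from the paper.
    """
    root = ET.Element("interface")
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table_name,) in tables:
        table_el = ET.SubElement(root, "table", name=table_name, label=table_name)
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table_name}")')
        for _cid, col_name, col_type, _notnull, _default, _pk in columns:
            # By default every column is shown and labelled with its own name.
            ET.SubElement(
                table_el,
                "column",
                name=col_name,
                type=col_type,
                label=col_name,
                visible="true",
            )
    return root


# Example database (assumed schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE simulation (id INTEGER PRIMARY KEY, title TEXT, result_url TEXT)"
)

spec = default_ui_spec(conn)

# Customising the generated document changes how the interface would be
# presented, without touching the code that processes the specification.
title_col = spec.find("./table[@name='simulation']/column[@name='title']")
title_col.set("label", "Simulation title")

spec_xml = ET.tostring(spec, encoding="unicode")
print(spec_xml)
```

Because the specification is plain XML, a separate interface processor could read it to decide which columns to display and how to label them. That separation between specification and processing is what the abstract describes.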

Text
mpslides.pdf - Other
Download (622kB)

More information

Published date: 19 February 2000
Additional Information: The associated (3M byte) Postscript file contains the slides from the conference presentation. Address: Springer-Verlag, Berlin
Venue - Dates: Advances in Database Technology - EDBT 2000, 2000-02-19
Organisations: Electronic & Software Systems

Identifiers

Local EPrints ID: 251314
URI: http://eprints.soton.ac.uk/id/eprint/251314
ISBN: 3-540-67227-3
ISSN: 0302-9743
PURE UUID: a92f8e98-e571-40f9-ac96-773de4e6410c

Catalogue record

Date deposited: 06 Nov 2001
Last modified: 14 Mar 2024 05:11

Contributors

Author: Mark Papiani
Author: Jasmin L. Wason
Author: Denis A. Nicole
Editor: Carlo Zaniolo
Editor: Peter C. Lockemann
Editor: Marc H. Scholl
Editor: Torsten Grust
