The University of Southampton
University of Southampton Institutional Repository

Managing very large distributed datasets on a data grid

de Oliveira Branco, Miguel, Zaluska, Ed, De Roure, David, Lassnig, Mario and Garonne, Vincent (2009) Managing very large distributed datasets on a data grid. Concurrency and Computation: Practice and Experience, 22 (11), 1338-1364. (doi:10.1002/cpe.1489).

Record type: Article

Abstract

In this work we address the management of very large datasets, which need to be stored and processed across many computing sites. The motivation for our work is the ATLAS Experiment for the Large Hadron Collider, where the authors have been involved in the development of the data management middleware. This middleware, called DQ2, has been used for the last several years by the ATLAS Experiment for shipping petabytes of data to research centres and universities worldwide. We describe our experience in developing and deploying DQ2 on the Worldwide LHC Computing Grid, a production Grid infrastructure formed of hundreds of computing sites. From this operational experience, we have identified an important degree of uncertainty that underlies the behaviour of large Grid infrastructures. This uncertainty is subjected to a detailed analysis, leading us to present novel modeling and simulation techniques for Data Grids. In addition, we discuss what we perceive as practical limits to the development of data distribution algorithms for Data Grids given the underlying infrastructure uncertainty, and propose future research directions.

Text
cpepaper_revised.pdf - Version of Record
Restricted to Repository staff only

More information

Published date: 21 August 2009
Keywords: distributed systems, data management, grid computing, modelling, simulation
Organisations: Web & Internet Science

Identifiers

Local EPrints ID: 267753
URI: http://eprints.soton.ac.uk/id/eprint/267753
ISSN: 1532-0626
PURE UUID: e5a10ac5-66bb-4b95-a6a6-466b8d67d53b
ORCID for David De Roure: orcid.org/0000-0001-9074-3016

Catalogue record

Date deposited: 04 Aug 2009 13:16
Last modified: 14 Mar 2024 08:57

Contributors

Author: Miguel de Oliveira Branco
Author: Ed Zaluska
Author: David De Roure
Author: Mario Lassnig
Author: Vincent Garonne
