Managing very large distributed datasets on a data grid

de Oliveira Branco, Miguel, Zaluska, Ed, De Roure, David, Lassnig, Mario and Garonne, Vincent (2009) Managing very large distributed datasets on a data grid. Concurrency and Computation: Practice and Experience, 22, (11), 1338-1364. (doi:10.1002/cpe.1489).




In this work we address the management of very large datasets, which need to be stored and processed across many computing sites. The motivation for our work is the ATLAS Experiment for the Large Hadron Collider, where the authors have been involved in the development of the data management middleware. This middleware, called DQ2, has been used for the last several years by the ATLAS Experiment for shipping petabytes of data to research centres and universities worldwide. We describe our experience in developing and deploying DQ2 on the Worldwide LHC Computing Grid, a production Grid infrastructure formed of hundreds of computing sites. From this operational experience, we have identified an important degree of uncertainty that underlies the behaviour of large Grid infrastructures. This uncertainty is subjected to a detailed analysis, leading us to present novel modeling and simulation techniques for Data Grids. In addition, we discuss what we perceive as practical limits to the development of data distribution algorithms for Data Grids given the underlying infrastructure uncertainty, and propose future research directions.

Item Type: Article
Digital Object Identifier (DOI): doi:10.1002/cpe.1489
ISSNs: 1532-0626 (print)
1532-0634 (electronic)
Keywords: distributed systems, data management, grid computing, modelling, simulation
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Physical Sciences and Engineering > Electronics and Computer Science > Web & Internet Science
ePrint ID: 267753
Accepted Date and Publication Date: 21 August 2009 (Published)
Date Deposited: 04 Aug 2009 13:16
Last Modified: 31 Mar 2016 14:15
