Interoperability Challenges in Grid for Industrial Applications

David De Roure
Electronics and Computer Science
University of Southampton, UK
dder@ecs.soton.ac.uk

Mike Surridge
IT Innovation
University of Southampton, UK
ms@it-innovation.soton.ac.uk

Abstract

The vision of the Grid Resources for Industrial Applications (GRIA) project is to make the Grid usable for business and industry: it enables Grid service providers and consumers to come together within a commercial business environment. This brings a strong requirement for a secure and interoperable Grid system which is open at the standards level, makes use of third-party components and can be used by other Grids. The intention is that the system should employ flexible negotiation techniques to enable other parties to join with ease. This raises a spectrum of interoperability challenges, some familiar (resource description, workload estimation, quality of service) but others which are evidently beyond current off-the-shelf solutions. These include representing quality of service for collections of related jobs which are embedded in complex business processes, and the need to describe the semantics of multiparty negotiations including hierarchical conversations. In this paper we describe the GRIA vision and approach, and highlight some of the challenges.

1. Introduction

The Grid Resources for Industrial Applications (GRIA) project has the clear but challenging aim of making the Grid usable for business and industry. This focus distinguishes GRIA from the many academic Grid infrastructures that are under development in the Grid community. The critical issues for business users include security, service levels and interoperability. GRIA is funded under the IST programme of the European Commission, and brings together five organisations, three of which represent the stakeholders in GRIA's vision of a business deployment of the Grid.

In this paper we focus on interoperability, which is our motivation for adoption of Semantic Web technologies. The interoperability challenge is twofold: GRIA aims to make use of third-party components and services, and aims to be used by other Grids. To do this it must be open at the standards level. The ease with which interoperability is achieved is the measure of success of this aspect of the project. At one end of the spectrum we could expect third parties to comply with GRIA; at the other we expect GRIA to comply with the rest of the world.

In this paper we first introduce the GRIA approach, then we identify some exemplar interoperability challenges.

2. A Grid for business

The parties that come together within GRIA include people who wish to buy computational resources and those who wish to provide them. The resource could be a general purpose Grid such as a managed PC cluster. However, it might relate to a specific Grid application in which the provider has expertise. By way of example, imagine an organisation which runs a computationally intensive application as part of its routine business: there may be times of peak demand when it wishes to outsource some of that computation to another provider; there may also be times when it has spare capacity and is prepared to sell its application service to external customers as a value-added service, or simply to make its high performance computing facility available for external use.

For each computation that a customer wishes to run, GRIA enters several phases:

Firstly it identifies available resources which may be appropriate for this job;
It then negotiates access to these resources and agrees service levels;
The service is then performed and the outcome determined;
At some stage, the job is billed to the customer's account.

It needs to do this in a way that is compliant with the business processes (and e-business processes) of the companies involved.
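The phases listed above can be sketched as a small state machine. This is only an illustration and not the GRIA implementation: the class names, the strict linear ordering and the terminal CLOSED state are our assumptions (in particular, billing happens "at some stage" and need not strictly follow execution).

```python
from enum import Enum, auto


class Phase(Enum):
    """Phases a computation passes through, following the list above."""
    DISCOVERY = auto()    # identify resources that may suit the job
    NEGOTIATION = auto()  # negotiate access and agree service levels
    EXECUTION = auto()    # perform the service and determine the outcome
    BILLING = auto()      # bill the job to the customer's account
    CLOSED = auto()       # hypothetical terminal state, not in the text


# Allowed forward transitions; the strict ordering is an assumption.
NEXT_PHASE = {
    Phase.DISCOVERY: Phase.NEGOTIATION,
    Phase.NEGOTIATION: Phase.EXECUTION,
    Phase.EXECUTION: Phase.BILLING,
    Phase.BILLING: Phase.CLOSED,
}


class Job:
    def __init__(self, name: str) -> None:
        self.name = name
        self.phase = Phase.DISCOVERY
        self.history = [Phase.DISCOVERY]

    def advance(self) -> Phase:
        """Move to the next phase; raise if the job is already closed."""
        if self.phase is Phase.CLOSED:
            raise ValueError(f"job {self.name!r} is already closed")
        self.phase = NEXT_PHASE[self.phase]
        self.history.append(self.phase)
        return self.phase
```

Keeping the history of phases per job is one simple way to support later auditing against a company's own business processes.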

Significantly, the GRIA system itself does not need to operate the computational resource that is used to run the service - it only needs sufficient knowledge of the resource to perform the above negotiations and agreements. This is analogous to a company issuing quotes for a service without actually reserving the necessary resources at the time of the quotation, which is standard practice. This has an important implication for the system design: the GRIA software runs on a computer somewhere within an organisation and does not run on the cluster itself.

The GRIA partners provide case studies in this approach. CESI runs a structural analysis application as part of their energy business; similarly, KINO performs post-production tasks within movie production, such as scene rendering and image enhancement. Meanwhile Dolphin are a supplier of high performance computing - they sell products and also, in GRIA, offer a service based on those products. Hence the partners are providers and users in the extended enterprise scenario described above.

These scenarios raise several challenges for GRIA:

We need to address message security: messages need to be encrypted/decrypted at the appropriate boundaries within the GRIA architecture;
Authentication and integrity checking are needed for the various parties;
A dynamic authorisation service is needed in order to enforce business workflows and exclude unwanted clients;
We need capacity management which does not rely on managing the resource pool - the expected behaviour is represented via `pseudo reservation';
As well as application services, we need negotiation services.

3. Quality of Service

When you wish to send a package by a courier agency, you do not need to specify the details of how your requirements will be implemented by the agency - you simply say how big and heavy the package is, how soon it must arrive and how much you are willing to spend. Similarly, in GRIA, the users submit their jobs with a set of requirements, including the deadline for the execution of the job.

At one end of the spectrum of quality of service (QoS), a system might not take into account these requirements - it will simply run the job with the available resources, without utilising reservations. This is `best effort' QoS. At the other end, the system might reserve all the required resources in order to meet the user's requirements.

In practice, we establish a QoS agreement. The user submits the job with some specific requirements, and the system responds to say whether it can satisfy the requirements. It does this by taking into account the available resources, the other jobs running and the expected future job submissions. There is then a negotiation between the user and the system, in order to arrive at an acceptable level of QoS.
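A minimal sketch of such an exchange follows, under strong simplifying assumptions that are ours rather than GRIA's: the provider is modelled as a single resource with a fixed processing rate, work is measured in abstract units, and expected future submissions are a single reserved amount. The names Provider, Offer, quote and commit are hypothetical. Commitments are only recorded in the provider's books, in the spirit of `pseudo reservation'; nothing is reserved on the resource itself.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    accepted: bool
    finish_hours: float  # estimated completion time, in hours from now


class Provider:
    """Toy provider that tracks commitments without touching the cluster."""

    def __init__(self, rate: float, expected_future_work: float = 0.0) -> None:
        self.rate = rate  # work units processed per hour (assumed model)
        self.expected_future_work = expected_future_work
        self.committed_work = 0.0  # total work of jobs already agreed

    def quote(self, work: float, deadline_hours: float) -> Offer:
        """Say whether the deadline can be met, else give the earliest finish."""
        backlog = self.committed_work + self.expected_future_work
        finish = (backlog + work) / self.rate
        return Offer(accepted=finish <= deadline_hours, finish_hours=finish)

    def commit(self, work: float) -> None:
        """Record an agreed job; no resource is actually reserved."""
        self.committed_work += work
```

When a quote is not accepted, the returned finish time can serve as a counter-offer, so that the user may relax the deadline or approach another provider.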

In order to obtain this QoS agreement, it is necessary to predict the execution time of a submitted job on each resource. This can be accomplished by a performance estimation service which, in GRIA, is divided into two parts:

Workload Estimation. This occurs on the consumer's (client's) side. The execution time of a job, memory usage and disk space requirements are estimated using application specific parameters.
Capacity Estimation. This occurs on the supplier's (server's) side. The capacity to do the job is estimated using resource specific parameters.

These estimations are represented as vectors containing 10 load parameters which cover CPU, memory, disk, data transfer and network transfer.
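To make the idea concrete, the sketch below combines a workload vector with a capacity vector to give an execution time estimate. The parameter names are hypothetical (the ten GRIA load parameters are not listed in this paper), and the estimation rule is an assumption for illustration: size requirements must fit within the resource's limits, and the rate-limited phases are assumed not to overlap, so their times add up.

```python
from typing import Optional

# Hypothetical parameter names; the actual GRIA parameters are not
# given in this paper.
RATE_PARAMS = ("cpu_ops", "disk_io_bytes", "network_bytes")
PEAK_PARAMS = ("memory_bytes", "disk_space_bytes")


def estimate_seconds(workload: dict, capacity: dict) -> Optional[float]:
    """Return an estimated run time in seconds, or None if the job cannot fit.

    workload: amounts the job needs (client-side workload estimation).
    capacity: rates per second and size limits (supplier-side estimation).
    """
    # Peak requirements (memory, disk space) must fit within the limits.
    for name in PEAK_PARAMS:
        if workload.get(name, 0) > capacity.get(name, 0):
            return None
    # Assumption: phases do not overlap, so per-parameter times are summed.
    total = 0.0
    for name in RATE_PARAMS:
        amount = workload.get(name, 0)
        if amount:
            total += amount / capacity[name]
    return total
```

Because the workload vector is computed by the client and the capacity vector by the supplier, neither side needs to disclose application-specific or resource-specific details beyond the vectors themselves.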

4. GRIA interoperability challenges

GRIA is in many ways very conservative, yet it is also adventurous: it uses off-the-shelf e-Commerce technology, and the Semantic Web and Semantic Grid, to do business! In this section we consider three scenarios which raise interoperability challenges.

4.1 Example 1 - querying 3rd party providers

I want to run a very large finite-element analysis, 100000 DOFs, fully dynamic treatment, and I need the result in one hour.

My GRIA client can figure out 10 workload parameters and establish which GRIA service providers can deliver the necessary computation. But none are able to finish the work in the time I need!

I know EPCC carry FEM codes running under GT3, and Stuttgart supplies services over UNICORE to several automotive customers. My needs are small by their standards - but do they have enough spare capacity to meet my deadline, and can I use them?

Question: How can I query other service providers who don't use the same resource and load representations as my grid infrastructure? If they can meet my needs, how will my client-side application submit jobs to their services?

This first challenge is about representing quality of service. We cannot expect everyone to comply with the client's particular representations, but we can expect people to adopt one of a number of available open standards. Hence we need to interoperate between these.

The load and resource vectors are well researched in the Grid community, and in GRIA we have been able to represent these in a variety of ways and demonstrate how we would go about achieving interoperability with other representations. This interoperability issue is also enjoying attention in the GRIP project [1].
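One simple way to think about interoperating between representations is a table of adapters, one per foreign schema, each mapping a field to a local parameter name and a unit scale factor. The sketch below is purely illustrative: the schema names, field names and local parameter names are invented, and it does not describe the formats used by any real Grid middleware.

```python
# Each adapter maps a foreign field to (local_name, scale_factor).
# Schema and field names are invented for illustration only.
ADAPTERS = {
    "schema_a": {
        "mflops": ("cpu_ops_per_s", 1e6),
        "mem_mb": ("memory_bytes", 1e6),
    },
    "schema_b": {
        "gflops": ("cpu_ops_per_s", 1e9),
        "ram_gb": ("memory_bytes", 1e9),
    },
}


def to_local(schema: str, description: dict) -> dict:
    """Translate a foreign resource description into local parameters."""
    mapping = ADAPTERS[schema]
    unknown = set(description) - set(mapping)
    if unknown:
        # Refuse to guess: an unmapped field may carry a real constraint.
        raise KeyError(f"no mapping for fields: {sorted(unknown)}")
    local = {}
    for field, value in description.items():
        name, scale = mapping[field]
        local[name] = value * scale
    return local
```

A fixed table like this only covers renaming and unit conversion. Where two representations differ in meaning rather than in naming, a richer semantic description is needed, which is part of our motivation for Semantic Web technologies.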

This problem becomes more challenging when we need to deal with collections of related jobs (i.e. processes) instead of single jobs.

4.2 Example 2 - multiple related workloads

I work for KINO, and I am making a commercial. KINO is based in Athens and handled much of the TV advertising for the 2004 Olympic Games. Now we are bidding for more business in this lucrative promotional sector, and we need to create a high quality sample as part of our bid. The set is the Olympic Village at the 2012 Games - which doesn't exist yet!

I need a 60-second sequence showing athletes preparing for the final day of competition, and my client (the artistic creator) wants several complete scenarios featuring different athletic events from which to edit the final cut.

I need to render 4500 high-definition frames based on an architect's model. I need the results first thing tomorrow morning, and my budget is only $15 for this job.

Three service providers can meet my needs. One will charge $0.10 per frame, but can handle 25000 frames for overnight delivery. There are cheaper suppliers, and my in-house systems can do it for free (but only if nothing else is running).

Question: how much service should I buy from each service provider, with what cancellation options, and where should I arrange output to be delivered?

Here we have moved from single jobs to combinations of jobs - which may be described as a process. There may be some common inputs, some input-output dependencies and some scope for concurrent execution.
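As an illustration of input-output dependencies and the scope for concurrent execution, a process can be written as a dependency graph and grouped into stages whose jobs may run at the same time. The job names below are invented, and the sketch uses the graphlib module from the Python standard library (Python 3.9 or later); it says nothing about cost or QoS, which is where the real difficulty lies.

```python
from graphlib import TopologicalSorter

# Hypothetical process: each job maps to the set of jobs whose output
# it needs. Job names are invented for illustration.
process = {
    "load_model": set(),
    "render_scene_a": {"load_model"},
    "render_scene_b": {"load_model"},
    "render_scene_c": {"load_model"},
    "edit_final_cut": {"render_scene_a", "render_scene_b", "render_scene_c"},
}


def concurrent_stages(dependencies: dict) -> list:
    """Group jobs into stages; jobs within a stage can run concurrently."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the dependencies are circular
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose inputs are complete
        stages.append(ready)
        sorter.done(*ready)
    return stages
```

For the process above, the three rendering jobs form a single stage between loading the shared model and editing the final cut, so they could be placed with different service providers.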

To address this we have been exploring workflow/process representation. It is easy to create and enact process representations (e.g. in WSFL), but we cannot reason about them or negotiate QoS over them. Although there are efforts in this space, such as the vision of OWL-S [2] as regards processes, our experience in GRIA is that there is no off-the-shelf solution to these requirements at this time.

Furthermore, these processes may be embedded in complex business processes, as illustrated by the next example.

4.3 Example 3 - embedding in business processes

Our friend from KINO is desperate. The usual supplier can't take on the excess work after all, and he needs to find other service providers. There are a couple of new guys on the block, but they don't use the normal tender/order/deliver/invoice process:

The National Technical University of Athens provides access to rendering codes. They post Condor class ads describing resources and assign them to the highest bidder using an auction protocol.

The local cable company provides rendering services on their digital video on demand servers during off-peak periods. However, they allow access only to signed-up clients, who can then submit jobs with no QoS negotiation. Delivery is on a `best efforts' basis, and tariffs depend on when each computation is submitted and when the result is needed.

Question: how can KINO's grid-enabled virtual digital studio environment seamlessly adapt to these new business models?

Currently GRIA takes a conservative approach to business processes, reproducing established procurement processes such as invitation to tender, proposals, orders, invoices, etc. We envisage a number of negotiation models, from the lightweight (best-efforts, no negotiation) to the highly specified Service Level Agreement, which may be the subject of iterative negotiation.

We want GRIA to handle arbitrary business processes. To do this we need to represent processes and conversations, including multiparty conversations. The GRIA system has a hierarchical conversation model which does not assume a globally agreed namespace (it uses an `our-ref, your-ref' model). In fact the most relevant work that can deal with the conversations between the various entities, and the negotiation possibilities, is in the field of agent-based computing [3].
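The `our-ref, your-ref' idea can be sketched as follows. This is a minimal illustration under our own assumptions and not the GRIA data model: the class names, the reference format and the method names are hypothetical. Each party issues references in its own local namespace, records the reference issued by the other party once it is known, and links a sub-conversation to its parent to form the hierarchy.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conversation:
    our_ref: str                             # reference we issued ourselves
    your_ref: Optional[str] = None           # reference the other party issued
    parent: Optional["Conversation"] = None  # enclosing conversation, if any

    def path(self) -> list:
        """Our references from the top-level conversation down to this one."""
        chain = []
        node = self
        while node is not None:
            chain.append(node.our_ref)
            node = node.parent
        return chain[::-1]


class Party:
    """Issues its own references; no globally agreed namespace is needed."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._counter = itertools.count(1)
        self.by_our_ref = {}

    def open(self, parent=None, your_ref=None) -> Conversation:
        """Start (or mirror) a conversation under a fresh local reference."""
        ref = f"{self.name}-{next(self._counter)}"
        conv = Conversation(ref, your_ref, parent)
        self.by_our_ref[ref] = conv
        return conv
```

For example, a client might open a tender, learn the supplier's reference from the reply, and later open an order as a sub-conversation of that tender; each side looks up incoming messages by its own reference only.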

We are currently investigating the use of FIPA [4] to represent conversations, an approach similar to that adopted in TAGA [5], where the messages are represented using OWL [6].

5. Conclusions

In this paper we have provided an overview of the GRIA approach and presented three scenarios which illustrate the challenges. What we are doing in GRIA is very much part of the Semantic Grid picture - it can be seen as the Semantic Web in the Grid rather than on the Grid, i.e. we are focusing on systems interoperability rather than the discovery and interoperable use of domain specific data. This work also raises questions about the relationship between the Semantic Grid and business - we see Grid services as an extended enterprise. Figure 1 illustrates this relationship from the perspective of the `layer cake' approach.

GRIA is ultra-conservative and we are applying very basic business processes using off-the-shelf technology. However, even this leads to profound questions for the Semantic Grid:

What are the standard ways to represent and compute QoS?
How do we represent, and reason about, multiple related jobs?
How do we interoperate even with simple negotiation mechanisms?

We suspect that a combination of methods is needed, including Semantic Web representations, agents, autonomous reasoning and adaptive behaviour.

Figure 1: Business fulfilment and the Semantic Grid

Acknowledgements

GRIA is an IST project (Project Number 33240) funded by the European Commission and coordinated by IT Innovation, University of Southampton, UK. The authors acknowledge the contributions of the other GRIA partners: National Technical University of Athens, Dolphin, CESI and KINO. We are grateful to Terry Payne for discussions about OWL-S.

References

[1]: Grid Interoperability Project, http://www.grid-interoperability.org/
[2]: DAML Services, http://www.daml.org/services/
[3]: C. Bartolini, C. Preist and N. R. Jennings (2002)
[4]: Foundation for Intelligent Physical Agents (FIPA), http://www.fipa.org/
[5]: Youyong Zou, Tim Finin, Li Ding, Harry Chen, and Rong Pan. TAGA: Trading Agent Competition in Agentcities, IJCAI-03 Workshop on Trading Agent Design and Analysis, August 2003.
[6]: OWL Web Ontology Language Guide, see http://www.w3.org/2001/sw/WebOnt/