Theory and practice of coordination algorithms exploiting the generalised distributive law

Delle Fave, Francesco Maria (2012) Theory and practice of coordination algorithms exploiting the generalised distributive law. University of Southampton, Faculty of Physical and Applied Sciences, Doctoral Thesis, 108pp.

Record type: Thesis (Doctoral)

Abstract

A key challenge for modern computer science is the development of technologies that allow interacting computer systems, typically referred as agents, to coordinate their decisions whilst operating in an environment with minimal human intervention. By so doing, the decision making capabilities of each of these agents should be improved by making decisions that take into account what the remaining agents intend to do. Against this background, the focus of this thesis is to study and design new coordination algorithms capable of achieving this improved performance.

In this line of work, there are two key research challenges that need to be addressed. First, the current state-of-the-art coordination algorithms have only been tested in simulation. This means that their practical performance still needs to be demonstrated in the real world. Second, none of the existing algorithms are capable of solving problems where the agents need to coordinate over complex decisions which typically require to trade off several parameters such as multiple objectives, the parameters of a sufficient statistic and the sample value and the bounds of an estimator. However, such parameters typically characterise the agents’ interactions within many real world domains. For this reason, deriving algorithms capable of addressing such complex interactions is a key challenge to bring research in coordination algorithms one step closer to successful deployment.

The aim of this thesis is to address these two challenges. To achieve this, we make two types of contribution. First, we develop a set practical contributions to address the challenge of testing the performance of state-of-the-art coordination algorithms in the real world. More specifically, we perform a case study on the deployment of the max-sum algorithm, a well known coordination algorithm, on a system that is couched in terms of allowing the first responders at the scene of a disaster to request imagery collection tasks of some of the most relevant areas to a team of unmanned aerial vehicles (UAVs). These agents then coordinate to complete the largest number of tasks. In more detail, max-sum is based on the generalised distributive law (GDL), a well known algebraic framework that has been used in disciplines such as artificial intelligence, machine learning and statistical physics, to derive effective algorithms to solve optimisation problems. Our iv contribution is the deployment of max-sum on real hardware and the evaluation of its performance in a real world setting. More specifically, we deploy max-sum on two UAVs (hexacopters) and test it a number of different settings. These tests show that max-sum does indeed perform well when confronted with the complexity and the unpredictability of the real world.

The second category of contributions are theoretical in nature. More specifically, we propose a new framework and a set of solution techniques to address the complex interactions requirement. To achieve this, we move back to theory and tackle a new class of problem involving agents engaged in complex interactions defined by multiple parameters. We name this class partially ordered distributed constraint optimisation problems (PO-DCOPs). Essentially, this generalises the well known distributed constraint optimisation problem (DCOP) framework to settings in which agents make decisions over multiple parameters such as multiple objectives, the parameters of a sufficient statistic and the sample value and the bounds of an estimator. To measure the quality of these decisions, it becomes necessary to strike a balance between these parameters and to achieve this, the outcome of these decisions is represented using partially ordered constraint functions.

Given this framework, we present three sub-classes of PO-DCOPs, each focusing on a different type of complex interaction. More specifically, we study (i) multi-objective DCOPs (MO-DCOPs) in which the agents’ decisions are defined over multiple objectives, (ii) risk-aware DCOPs (RA-DCOPs) in which the outcome of the agents’ decisions is not known with certainty and thus, where the agents need to carefully weigh the risk of making decisions that might lead to poor and unexpected outcomes and, (iii) multiarm bandit DCOPs (MAB-DCOPs) where the agents need to learn the outcome of their decisions online. To solve these problems, we again exploit the GDL framework. In particular, we employ the flexibility of the GDL to obtain either optimal or bounded approximate algorithms to solve PO-DCOPs. The key insight is to use the algebraic properties of the GDL to instantiate well known DCOP algorithms such as DPOP, Action GDL or bounded max-sum to solve PO-DCOPs. Given the properties of these algorithms, we derive a new set of solution techniques. To demonstrate their effectiveness, we study the properties of these algorithms empirically on various instances of MO-DCOPs, RA-DCOPs and MAB-DCOPs. Our experiments emphasize two key traits of the algorithms. First, bounded approximate algorithms perform well in terms of our requirements. Second, optimal algorithms incur an increase in both the computation and communication load necessary to solve PO-DCOPs because they are trying to optimally solve a problem which is potentially more complex than canonical DCOPs.

Text

PhD_thesis.pdf - Other

Download (4MB)