University of Southampton Institutional Repository

Decentralized Bayesian reinforcement learning for online agent collaboration

Teacy, W.T.L., Chalkiadakis, G., Farinelli, A., Rogers, A., Jennings, N.R., McClean, S. and Parr, G. (2012) Decentralized Bayesian reinforcement learning for online agent collaboration. 11th International Conference on Autonomous Agents and Multiagent Systems, Valencia, Spain, 04-08 Jun 2012, pp. 417-424.

Record type: Conference or Workshop Item (Paper)

Abstract

Solving complex but structured problems in a decentralized manner via multiagent collaboration has received much attention in recent years. This is natural: on the one hand, multiagent systems usually possess a structure that determines the allowable interactions among the agents; on the other, the single most pressing need in a cooperative multiagent system is to coordinate the local policies of autonomous agents with restricted capabilities to serve a system-wide goal. The presence of uncertainty makes this even more challenging, as the agents face the additional need to learn the unknown environment parameters while forming (and following) local policies in an online fashion. In this paper, we provide the first Bayesian reinforcement learning (BRL) approach for distributed coordination and learning in a cooperative multiagent system by devising two solutions to this type of problem. More specifically, we show how the Value of Perfect Information (VPI) can be used to perform efficient decentralized exploration in both model-based and model-free BRL, and in the latter case, provide a closed-form solution for VPI, correcting a decade-old result by Dearden, Friedman and Russell. To evaluate these solutions, we present experimental results comparing their relative merits, and demonstrate empirically that both solutions outperform an existing multiagent learning method representative of the state of the art.
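
For context, the VPI exploration rule mentioned in the abstract scores each action by its posterior mean Q-value plus the expected gain from learning that action's true value exactly. Below is a minimal, illustrative sketch, assuming independent Gaussian posteriors over Q-values and using a Monte Carlo estimate in place of the closed form derived in the paper; the function name vpi and all parameter values are hypothetical, not the authors' implementation.

    import numpy as np

    def vpi(means, stds, n_samples=10_000, rng=None):
        # Monte Carlo estimate of the Value of Perfect Information for each
        # action, given independent Gaussian posteriors over Q(s, a).
        # (Assumption: Gaussian posteriors and sampling are used here for
        # illustration; the paper derives an exact closed form instead.)
        rng = np.random.default_rng() if rng is None else rng
        means = np.asarray(means, dtype=float)
        stds = np.asarray(stds, dtype=float)
        order = np.argsort(means)[::-1]
        a1, a2 = order[0], order[1]  # best and second-best posterior means
        vpis = np.empty_like(means)
        for a in range(len(means)):
            q = rng.normal(means[a], stds[a], n_samples)  # samples of true Q(s, a)
            if a == a1:
                # Information helps only if the apparently best action is
                # actually worse than the second-best one.
                gain = np.maximum(means[a2] - q, 0.0)
            else:
                # Information helps only if this action is actually better
                # than the apparently best one.
                gain = np.maximum(q - means[a1], 0.0)
            vpis[a] = gain.mean()
        return vpis

    # Exploration rule: act greedily on mean Q plus the value of exploring.
    means, stds = [1.0, 0.8, 0.2], [0.1, 0.5, 0.3]
    scores = np.asarray(means) + vpi(means, stds)
    best_action = int(np.argmax(scores))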

Text
AAMAS2012_0089_7016861a9.pdf - Version of Record (420kB)

More information

Published date: 4 June 2012
Venue - Dates: 11th International Conference on Autonomous Agents and Multiagent Systems, Valencia, Spain, 2012-06-04 - 2012-06-08
Keywords: multiagent learning, Bayesian techniques, uncertainty
Organisations: Agents, Interactions & Complexity

Identifiers

Local EPrints ID: 273201
URI: http://eprints.soton.ac.uk/id/eprint/273201
PURE UUID: 95803f11-7d3b-4f7b-96df-d63c4eeb389a

Catalogue record

Date deposited: 08 Feb 2012 13:11
Last modified: 14 Mar 2024 10:21

Contributors

Author: W.T.L. Teacy
Author: G. Chalkiadakis
Author: A. Farinelli
Author: A. Rogers
Author: N.R. Jennings
Author: S. McClean
Author: G. Parr
