Experience-based Reinforcement Learning to Acquire Effective Behavior in a Multiagent Domain
Arai, Sachiyo, Sycara, Katia and Payne, Terry R. (2000) Experience-based Reinforcement Learning to Acquire Effective Behavior in a Multiagent Domain. In The Sixth Pacific Rim International Conference on Artificial Intelligence (PRICAI 2000), pp. 125-135.
Record type: Conference or Workshop Item (Paper)
Abstract
In this paper, we discuss Profit-sharing, an experience-based reinforcement learning approach (similar to a Monte-Carlo based reinforcement learning method) that can be used to learn robust and effective actions within uncertain, dynamic, multi-agent systems. We introduce the cut-loop routine, which discards looping behavior, and demonstrate its effectiveness empirically within a simplified NEO (non-combatant evacuation operation) domain. This domain consists of several agents that ferry groups of evacuees to one of several shelters. We demonstrate that the cut-loop routine makes the Profit-sharing approach adaptive and robust within a dynamic and uncertain domain, without the need for pre-defined knowledge or subgoals. We also compare it empirically with the popular Q-learning approach.
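The cut-loop idea summarised above can be illustrated with a short sketch (this is not code from the paper; the function names, the geometric credit function, and the decay parameter are illustrative assumptions). At the end of each episode, segments between repeated visits to the same state are discarded before the terminal reward is shared backwards over the surviving state-action pairs, so looping behaviour never accumulates credit.

from collections import defaultdict

def cut_loops(episode):
    """Discard looping segments from an episode trace (sketch of the cut-loop idea).

    Whenever a state is revisited, everything recorded since its first visit is
    dropped, so actions that merely led back to the same state are not reinforced.
    """
    pruned, seen = [], {}
    for state, action in episode:
        if state in seen:
            del pruned[seen[state]:]                          # remove the looped segment
            seen = {s: i for i, (s, _) in enumerate(pruned)}  # rebuild the visit index
        seen[state] = len(pruned)
        pruned.append((state, action))
    return pruned

def profit_sharing_update(weights, episode, reward, decay=0.5):
    """Episode-wise (Monte-Carlo-like) credit assignment in the Profit-sharing style.

    The reward received at the goal is shared backwards along the pruned episode
    with geometrically decreasing credit (the decay value is an assumed example).
    """
    credit = reward
    for state, action in reversed(cut_loops(episode)):
        weights[(state, action)] += credit
        credit *= decay

# Example: a trace containing the loop s0 -> s1 -> s0 is pruned before the update.
weights = defaultdict(float)
profit_sharing_update(weights, [("s0", "a1"), ("s1", "a0"), ("s0", "a2"), ("goal", "stop")], reward=100.0)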
Text: pricai2000.pdf - Other
More information
Published date: 2000
Venue - Dates: The Sixth Pacific Rim International Conference on Artificial Intelligence (PRICAI 2000), 2000-01-01
Organisations: Electronics & Computer Science
Identifiers
Local EPrints ID: 257790
URI: http://eprints.soton.ac.uk/id/eprint/257790
PURE UUID: 8dfcafb1-03cf-4e26-9fdb-fa46a8539726
Catalogue record
Date deposited: 24 Jun 2003
Last modified: 14 Mar 2024 06:03
Contributors
Author: Sachiyo Arai
Author: Katia Sycara
Author: Terry R. Payne