Thompson sampling based Monte-Carlo planning in POMDPs


Bai, Aijun, Wu, Feng, Zhang, Zongzhang and Chen, Xiaoping (2014) Thompson sampling based Monte-Carlo planning in POMDPs. In Proceedings of the 24th International Conference on Automated Planning and Scheduling (ICAPS-14), Portsmouth, US, 21 - 26 Jun 2014. (In Press).

Description/Abstract

Monte-Carlo tree search (MCTS) has drawn great interest in recent years for planning under uncertainty. One of its key challenges is the tradeoff between exploration and exploitation. To address this, we introduce a novel online planning algorithm for large POMDPs using Thompson sampling based MCTS that balances cumulative and simple regret. The proposed algorithm, Dirichlet-Dirichlet-NormalGamma based Partially Observable Monte-Carlo Planning (D2NG-POMCP), treats the accumulated reward of performing an action from a belief state in the MCTS search tree as a random variable following an unknown distribution with hidden parameters. A Bayesian method is used to model and infer the posterior distribution of these parameters, choosing the conjugate prior in the form of a combination of two Dirichlet distributions and one NormalGamma distribution. Thompson sampling is then used to guide action selection in the search tree. Experimental results confirm that our algorithm outperforms state-of-the-art approaches on several common benchmark problems.
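
As a rough illustration of the idea in the abstract, the following Python sketch applies Thompson sampling with a NormalGamma conjugate prior to a flat multi-armed bandit. It is not the authors' D2NG-POMCP implementation: the search tree, the belief states and the two Dirichlet components are omitted, and the names `NormalGamma`, `thompson_select` and `run_bandit`, the prior hyperparameters and the Gaussian reward model are all assumptions made for this example.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class NormalGamma:
    """Posterior over the unknown mean and precision of a Normal reward.

    Hyperparameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    mu: float = 0.0     # prior mean
    lam: float = 0.01   # pseudo-count backing the prior mean
    alpha: float = 1.0  # shape of the Gamma prior on the precision
    beta: float = 1.0   # rate of the Gamma prior on the precision

    def update(self, x: float) -> None:
        """Standard conjugate update for a single observation x."""
        self.beta += self.lam * (x - self.mu) ** 2 / (2.0 * (self.lam + 1.0))
        self.mu = (self.lam * self.mu + x) / (self.lam + 1.0)
        self.lam += 1.0
        self.alpha += 0.5

    def sample_mean(self, rng: random.Random) -> float:
        """Draw (tau, mu) from the posterior and return the sampled mean."""
        # gammavariate takes a scale parameter, so pass 1 / rate.
        tau = rng.gammavariate(self.alpha, 1.0 / self.beta)
        return rng.gauss(self.mu, math.sqrt(1.0 / (self.lam * tau)))


def thompson_select(posteriors, rng):
    """Pick the action whose sampled mean reward is largest."""
    samples = [p.sample_mean(rng) for p in posteriors]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_means, steps=2000, seed=0):
    """Simulate Thompson sampling on hypothetical Gaussian arms."""
    rng = random.Random(seed)
    posteriors = [NormalGamma() for _ in true_means]
    counts = [0] * len(true_means)
    for _ in range(steps):
        a = thompson_select(posteriors, rng)
        reward = rng.gauss(true_means[a], 1.0)  # assumed unit-variance noise
        posteriors[a].update(reward)
        counts[a] += 1
    return posteriors, counts
```

Calling `run_bandit([0.0, 0.5, 2.0])` returns the per-arm posteriors and pull counts. Because each action is chosen by sampling from its posterior rather than by a fixed exploration bonus, arms with few observations keep wide posteriors and are still tried occasionally, while pulls concentrate on the arm with the highest estimated reward as evidence accumulates.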

Item Type: Conference or Workshop Item (Paper)
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Physical Sciences and Engineering > Electronics and Computer Science > Agents, Interactions & Complexity
ePrint ID: 360985
Date Deposited: 14 Jan 2014 10:39
Last Modified: 27 Mar 2014 21:14
URI: http://eprints.soton.ac.uk/id/eprint/360985
