The University of Southampton
University of Southampton Institutional Repository

Standard dynamic programming applied to time aggregated Markov decision processes

In this note we address the time aggregation approach to ergodic finite-state Markov decision processes (MDPs) with uncontrollable states. We propose using time aggregation as an intermediate step toward constructing a transformed MDP whose state space consists solely of the controllable states. The proposed approach simplifies the iterative search for the optimal solution by eliminating the need to define an equivalent parametric function, and it yields a problem that can be solved by simpler, standard MDP algorithms.

Arruda, Edilson F. and Fragoso, Marcelo D. (2009) Standard dynamic programming applied to time aggregated Markov decision processes. In Proceedings of the 48th IEEE Conference on Decision and Control held jointly with 2009 28th Chinese Control Conference, CDC/CCC 2009. pp. 2576-2580. (doi:10.1109/CDC.2009.5400692).

Record type: Conference or Workshop Item (Paper)

Full text not available from this repository.

More information

Published date: 1 December 2009
Venue - Dates: 48th IEEE Conference on Decision and Control held jointly with 2009 28th Chinese Control Conference, CDC/CCC 2009, Shanghai, China, 2009-12-15 - 2009-12-18
Keywords: Dynamic programming, Markov decision processes, Time aggregation

Identifiers

Local EPrints ID: 445883
URI: http://eprints.soton.ac.uk/id/eprint/445883
ISSN: 0191-2216
PURE UUID: 5879f18c-92ef-4c3c-bc83-aa1dcbb4ba83
ORCID for Edilson F. Arruda: orcid.org/0000-0002-9835-352X

Catalogue record

Date deposited: 13 Jan 2021 17:30
Last modified: 18 Feb 2021 17:42

Contributors

Author: Edilson F. Arruda
Author: Marcelo D. Fragoso
