The University of Southampton
University of Southampton Institutional Repository

Markov decision processes with delays and asynchronous cost collection


Katsikopoulos, K.V. and Engelbrecht, S.E. (2003) Markov decision processes with delays and asynchronous cost collection. IEEE Transactions on Automatic Control, 48 (4), 568-574. (doi:10.1109/TAC.2003.809799).

Record type: Article

Abstract

Markov decision processes (MDPs) may involve three types of delays. First, state information, rather than being available instantaneously, may arrive with a delay (observation delay). Second, an action may take effect at a later decision stage rather than immediately (action delay). Third, the cost induced by an action may be collected after a number of stages (cost delay). We derive two results, one for constant and one for random delays, for reducing an MDP with delays to an MDP without delays, which differs only in the size of the state space. The results are based on the intuition that costs may be collected asynchronously, i.e., at a stage other than the one in which they are induced, as long as they are discounted properly.
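
The intuition about asynchronous cost collection can be illustrated numerically for the simplest case, a constant cost delay. The sketch below is not the construction from the article; it only assumes the standard discounted-cost criterion, and the function name `discounted_total`, the discount factor, the delay `k`, and the cost sequences are all made up for illustration. If every cost induced at stage t is collected at stage t + k, each term picks up an extra factor of gamma^k, so the total is the undelayed total scaled by gamma^k and the ranking of policies is unchanged.

```python
import math


def discounted_total(costs, gamma, delay=0):
    """Discounted sum where the cost induced at stage t is collected at stage t + delay."""
    return sum(gamma ** (t + delay) * c for t, c in enumerate(costs))


# Hypothetical values, chosen only for illustration.
gamma = 0.9
k = 3  # constant cost delay, in stages
costs_policy_a = [4.0, 1.0, 1.0, 1.0]
costs_policy_b = [1.0, 2.0, 2.0, 2.0]

undelayed_a = discounted_total(costs_policy_a, gamma)
undelayed_b = discounted_total(costs_policy_b, gamma)
delayed_a = discounted_total(costs_policy_a, gamma, delay=k)
delayed_b = discounted_total(costs_policy_b, gamma, delay=k)

# Delayed collection rescales every total by the same factor gamma ** k ...
assert math.isclose(delayed_a, gamma ** k * undelayed_a)
assert math.isclose(delayed_b, gamma ** k * undelayed_b)

# ... so whichever policy is cheaper without the delay is also cheaper with it.
assert (undelayed_a < undelayed_b) == (delayed_a < delayed_b)
```

Because the scaling factor is the same for every policy, a constant cost delay by itself does not change which policy is optimal in this toy setting. Observation delays, action delays, and random delays need the enlarged state space mentioned in the abstract and are not covered by this sketch.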

Text
Katsikopoulos Engelbrecht - Version of Record
Restricted to Repository staff only

More information

Published date: April 2003

Identifiers

Local EPrints ID: 415448
URI: http://eprints.soton.ac.uk/id/eprint/415448
PURE UUID: f3d3bcdd-fa5f-433f-ab93-a60e01a441b9
ORCID for K.V. Katsikopoulos: orcid.org/0000-0002-9572-1980

Catalogue record

Date deposited: 10 Nov 2017 17:30
Last modified: 16 Mar 2024 04:27

Contributors

Author: K.V. Katsikopoulos
Author: S.E. Engelbrecht
