The University of Southampton
University of Southampton Institutional Repository

An application of convex optimization concepts to approximate dynamic programming

Arruda, Edilson F., Fragoso, Marcelo D. and Do Val, João Bosco R. (2008) An application of convex optimization concepts to approximate dynamic programming. In 2008 American Control Conference, ACC. pp. 4238-4243. (doi:10.1109/ACC.2008.4587159).

Record type: Conference or Workshop Item (Paper)

Abstract

This paper deals with approximate value iteration (AVI) algorithms applied to discounted dynamic programming (DP) problems. The so-called Bellman residual is shown to be convex in the Banach space of candidate solutions to the DP problem. This fact motivates the introduction of an AVI algorithm with local search that seeks an approximate solution in a lower-dimensional space called the approximation architecture. The optimality of a point in the approximation architecture is characterized by means of convex optimization concepts, and necessary and sufficient conditions for global optimality are derived. To illustrate the method, two examples previously explored in the literature are presented.

This record has no associated files available for download.

More information

Published date: 30 September 2008
Venue - Dates: 2008 American Control Conference, ACC, Seattle, WA, United States, 2008-06-11 - 2008-06-13

Identifiers

Local EPrints ID: 445870
URI: http://eprints.soton.ac.uk/id/eprint/445870
ISSN: 0743-1619
PURE UUID: 27c1c4f5-af3e-4a2e-9740-969e200f9232
ORCID for Edilson F. Arruda: orcid.org/0000-0002-9835-352X

Catalogue record

Date deposited: 12 Jan 2021 17:32
Last modified: 16 Apr 2024 01:59

Contributors

Author: Edilson F. Arruda
Author: Marcelo D. Fragoso
Author: João Bosco R. Do Val
