An application of convex optimization concepts to approximate dynamic programming
This paper deals with approximate value iteration (AVI) algorithms applied to discounted dynamic programming (DP) problems. The so-called Bellman residual is shown to be convex in the Banach space of candidate solutions to the DP problem. This fact motivates the introduction of an AVI algorithm with local search that seeks an approximate solution in a lower-dimensional space called the approximation architecture. The optimality of a point in the approximation architecture is characterized by means of convex optimization concepts, and necessary and sufficient conditions for global optimality are derived. To illustrate the method, two examples previously explored in the literature are presented.
4238-4243
Arruda, Edilson F.
Fragoso, Marcelo D.
Do Val, João Bosco R.
30 September 2008
Arruda, Edilson F., Fragoso, Marcelo D. and Do Val, João Bosco R. (2008) An application of convex optimization concepts to approximate dynamic programming. In 2008 American Control Conference, ACC. (doi:10.1109/ACC.2008.4587159).
Record type: Conference or Workshop Item (Paper)
More information
Published date: 30 September 2008
Venue - Dates: 2008 American Control Conference, ACC, Seattle, WA, United States, 2008-06-11 - 2008-06-13
Identifiers
Local EPrints ID: 445870
URI: http://eprints.soton.ac.uk/id/eprint/445870
ISSN: 0743-1619
PURE UUID: 27c1c4f5-af3e-4a2e-9740-969e200f9232
Catalogue record
Date deposited: 12 Jan 2021 17:32
Last modified: 16 Apr 2024 01:59
Contributors
Author: Edilson F. Arruda
Author: Marcelo D. Fragoso
Author: João Bosco R. Do Val