Approximate dynamic programming based on expansive projections
We present a general method for obtaining convergent approximate value iteration algorithms with function approximation. The result applies to arbitrary approximation architectures and generalizes existing results in the literature that were derived for particular approximation schemes. Additionally, we show how to obtain a convergent approximate mapping whose fixed point is the projection, onto the approximation space and with respect to a suitable subset norm, of a fixed point of the exact dynamic programming mapping. This result relies on evaluating the difference between successive iterates in the selected subset norm, which yields convergent procedures for arbitrary approximation architectures.
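For readers unfamiliar with the setting, the following is a minimal sketch of approximate value iteration, not the method of the paper. It assumes a finite, cost-minimizing Markov decision process and uses state aggregation (averaging within groups of states), one of the particular schemes for which convergence is already well known because the averaging step is non-expansive in the maximum norm. All names (`bellman_backup`, `aggregate_projection`, `approximate_value_iteration`, `subset`) and the toy problem are hypothetical and chosen only for illustration; the stopping test on a subset of states is a loose stand-in for the idea of comparing successive iterates, not the paper's subset norm.

```python
from typing import List, Optional, Sequence, Tuple

# trans[s][a] is a list of (next_state, probability) pairs (assumed layout).
Transitions = List[List[List[Tuple[int, float]]]]


def bellman_backup(v: Sequence[float], costs: List[List[float]],
                   trans: Transitions, gamma: float) -> List[float]:
    """Exact dynamic programming mapping for a discounted, cost-minimizing MDP."""
    return [
        min(
            costs[s][a] + gamma * sum(p * v[s2] for s2, p in trans[s][a])
            for a in range(len(costs[s]))
        )
        for s in range(len(costs))
    ]


def aggregate_projection(v: Sequence[float],
                         groups: List[List[int]]) -> List[float]:
    """Project v onto piecewise-constant functions by averaging within each group."""
    out = [0.0] * len(v)
    for group in groups:
        mean = sum(v[s] for s in group) / len(group)
        for s in group:
            out[s] = mean
    return out


def approximate_value_iteration(costs: List[List[float]], trans: Transitions,
                                gamma: float, groups: List[List[int]],
                                subset: Optional[Sequence[int]] = None,
                                tol: float = 1e-8,
                                max_iter: int = 10_000) -> Tuple[List[float], int]:
    """Iterate v <- projection(backup(v)) until successive iterates agree on `subset`."""
    states = list(range(len(costs))) if subset is None else list(subset)
    v = [0.0] * len(costs)
    for k in range(max_iter):
        v_next = aggregate_projection(bellman_backup(v, costs, trans, gamma), groups)
        # Largest change between successive iterates, measured only on `states`.
        diff = max(abs(v_next[s] - v[s]) for s in states)
        v = v_next
        if diff < tol:
            return v, k + 1
    return v, max_iter  # iteration cap reached without meeting the tolerance


# Hypothetical 4-state chain: action 0 stays, action 1 moves right;
# state 3 is cost-free and absorbing.
n = 4
costs = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
trans = [[[(s, 1.0)], [(min(s + 1, n - 1), 1.0)]] for s in range(n)]
groups = [[0, 1], [2, 3]]
v, iters = approximate_value_iteration(costs, trans, 0.9, groups, subset=[0, 2])
```

In this sketch the returned `v` is constant on each group and is, up to the tolerance, a fixed point of the composed mapping (projection after backup). With a projection that can be expansive, such as an unweighted least-squares fit onto general features, the same loop may fail to converge, which is the situation the abstract addresses.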
Pages: 5537-5542
Arruda, Edilson F.
Do Val, João B.R.
1 January 2006
Arruda, Edilson F. and Do Val, João B.R. (2006) Approximate dynamic programming based on expansive projections. In Proceedings of the 45th IEEE Conference on Decision and Control 2006, CDC. IEEE. (doi:10.1109/cdc.2006.376823).
Record type: Conference or Workshop Item (Paper)
More information
Published date: 1 January 2006
Venue - Dates: 45th IEEE Conference on Decision and Control 2006, CDC, San Diego, CA, United States, 2006-12-13 - 2006-12-15
Identifiers
Local EPrints ID: 445711
URI: http://eprints.soton.ac.uk/id/eprint/445711
ISSN: 0191-2216
PURE UUID: e8bae917-baee-42fd-afd4-47745815e2b7
Catalogue record
Date deposited: 06 Jan 2021 17:41
Last modified: 16 Apr 2024 01:59
Contributors
Author: Edilson F. Arruda
Author: João B.R. Do Val