Optimal approximation schedules for iterative algorithms with application to dynamic programming
Many iterative algorithms rely on operators that may be difficult or impossible to evaluate exactly, but for which approximations are available. Furthermore, a graduated range of approximations may be available, inducing a functional relationship between computational complexity and approximation tolerance. In such a case, a reasonable strategy is to vary the tolerance over iterations, starting with a cruder approximation and gradually decreasing the tolerance as the solution is approached. In this article, it is shown that, under general conditions, for linearly convergent algorithms the optimal approximation tolerance converges at the same linear rate as the exact algorithm itself, regardless of the tolerance/complexity relationship. We illustrate this result by presenting a partial information value iteration (PIVI) algorithm for discrete-time dynamic programming problems. The proposed algorithm uses increasingly accurate partial model information to reduce the computational burden of the standard value iteration algorithm. The algorithm is applied to a stochastic network example and benchmarked against value iteration.
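The idea of shrinking the tolerance at the algorithm's own linear rate can be illustrated with a minimal sketch. This is not the authors' PIVI algorithm: the small random MDP, the function names, and the artificial perturbation standing in for an approximate operator are all assumptions made for illustration. Value iteration is a contraction with rate gamma, so the sketch sets the tolerance at iteration k to tol0 * gamma**k.

```python
import random


def bellman(V, P, R, gamma):
    """Exact Bellman optimality operator for a finite MDP."""
    n = len(V)
    return [
        max(
            R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n))
            for a in range(len(R[s]))
        )
        for s in range(n)
    ]


def approx_bellman(V, P, R, gamma, tol):
    """Hypothetical approximate operator: the exact result plus a
    perturbation whose sup-norm is exactly tol (assumption for illustration)."""
    TV = bellman(V, P, R, gamma)
    return [v + (tol if s % 2 == 0 else -tol) for s, v in enumerate(TV)]


def sup_dist(U, V):
    """Sup-norm distance between two value vectors."""
    return max(abs(u - v) for u, v in zip(U, V))


def make_mdp(n_states=5, n_actions=2, seed=0):
    """Build a small random MDP (illustrative data, not from the paper)."""
    rng = random.Random(seed)
    P, R = [], []
    for _ in range(n_states):
        rows, rewards = [], []
        for _ in range(n_actions):
            w = [rng.random() for _ in range(n_states)]
            total = sum(w)
            rows.append([x / total for x in w])  # normalise to a distribution
            rewards.append(rng.random())
        P.append(rows)
        R.append(rewards)
    return P, R


def run(gamma=0.9, tol0=1.0, iters=200):
    """Run approximate value iteration with tolerance tol0 * gamma**k
    and return the sup-norm error to the exact fixed point per iteration."""
    P, R = make_mdp()
    n = len(P)

    # Reference solution from many exact iterations.
    V_star = [0.0] * n
    for _ in range(2000):
        V_star = bellman(V_star, P, R, gamma)

    V = [0.0] * n
    errors = []
    for k in range(iters):
        tol = tol0 * gamma ** k  # tolerance decays at the contraction rate
        V = approx_bellman(V, P, R, gamma, tol)
        errors.append(sup_dist(V, V_star))
    return errors


if __name__ == "__main__":
    errs = run()
    print("first error:", errs[0], "last error:", errs[-1])
```

Under these assumptions the error after m iterations is bounded by gamma**m times the initial error plus m * tol0 * gamma**(m - 1), so the crude early approximations do not prevent convergence. The sketch does not model the computational cost of each tolerance level, which is what the optimality result in the paper concerns.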
Pages: 4087-4094
Almudevar, Anthony
Arruda, Edilson F.
1 December 2007
Almudevar, Anthony and Arruda, Edilson F. (2007) Optimal approximation schedules for iterative algorithms with application to dynamic programming. In Proceedings of the 46th IEEE Conference on Decision and Control 2007, CDC. pp. 4087-4094. (doi:10.1109/CDC.2007.4434681).
Record type: Conference or Workshop Item (Paper)
More information
Published date: 1 December 2007
Venue - Dates: 46th IEEE Conference on Decision and Control 2007, CDC, New Orleans, LA, United States, 2007-12-12 - 2007-12-14
Identifiers
Local EPrints ID: 445863
URI: http://eprints.soton.ac.uk/id/eprint/445863
ISSN: 0191-2216
PURE UUID: 46dafeef-5709-4102-bf2a-640c9cf652d7
Catalogue record
Date deposited: 12 Jan 2021 17:32
Last modified: 17 Mar 2024 04:04
Contributors
Author: Anthony Almudevar
Author: Edilson F. Arruda