Learning to learn with active adaptive perception
Bossens, David, Townsend, Nicholas and Sobey, Adam
(2019)
Learning to learn with active adaptive perception.
Neural Networks, 115, 30-49.
(doi:10.1016/j.neunet.2019.03.006).
Abstract
Increasingly, autonomous agents will be required to operate on long-term missions. This will create a demand for general intelligence, because feedback from a human operator may be sparse and delayed, and because not all behaviours can be prescribed in advance. Deep neural networks and reinforcement learning methods can be applied in such environments, but their fixed updating routines imply an inductive bias in learning spatio-temporal patterns, meaning some environments will be unsolvable. To address this problem, this paper proposes active adaptive perception: the ability of an architecture to learn when and how to modify and selectively utilise its perception module. To achieve this, a generic architecture based on a self-modifying policy (SMP) is proposed and implemented using Incremental Self-improvement with the Success Story Algorithm. The architecture contrasts with deep reinforcement learning systems, which follow fixed training strategies, and with earlier SMP studies, whose perception relied either entirely on the working memory or on untrainable active-perception instructions. One computationally cheap and one more expensive implementation are presented and compared to DRQN (an off-policy deep reinforcement learner using experience replay) and to Incremental Self-improvement (an SMP) on various non-episodic, partially observable mazes. The results show that the simple instruction set leads to emergent strategies for avoiding detracting corridors and rooms, and that the expensive implementation allows perception to be selectively ignored where it is inaccurate.
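To make the learning mechanism concrete, the sketch below (in Python) illustrates the backtracking step of the Success Story Algorithm on which Incremental Self-improvement relies, as described in the self-modifying-policy literature: at a checkpoint, self-modifications that were not followed by a higher rate of reward intake are undone. The function name, tuple layout, and undo_fn callback are illustrative assumptions, not the paper's implementation.

def ssa_checkpoint(stack, cumulative_reward, now, undo_fn):
    """Undo self-modifications that were not followed by a higher rate of
    reward intake, so the surviving stack tells a "success story".

    stack: list of (time, reward_at_time, undo_info) tuples, oldest first,
           with a sentinel (0, 0.0, None) at the bottom for the trial start.
    cumulative_reward: total reward collected up to time `now`.
    undo_fn: callable that reverses one self-modification (hypothetical).
    """
    while len(stack) >= 2:
        t_prev, r_prev, _ = stack[-2]          # older surviving modification
        t_last, r_last, undo_info = stack[-1]  # most recent modification
        # Average reward per time step since each modification.
        rate_prev = (cumulative_reward - r_prev) / max(now - t_prev, 1)
        rate_last = (cumulative_reward - r_last) / max(now - t_last, 1)
        if rate_last > rate_prev:
            break            # success-story criterion holds; stop popping
        undo_fn(undo_info)   # revert the modification that did not pay off
        stack.pop()

In Incremental Self-improvement, checkpoints of this kind can be triggered by instructions in the policy itself, so the decision of when to evaluate past self-modifications is itself learned.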
Text: NEUNET-D-18-00187R3 (Accepted Manuscript)
More information
Accepted/In Press date: 12 March 2019
e-pub ahead of print date: 25 March 2019
Published date: July 2019
Keywords:
adaptive perception, inductive bias, self-modifying policies, reinforcement learning, partial observability
Identifiers
Local EPrints ID: 430578
URI: http://eprints.soton.ac.uk/id/eprint/430578
PURE UUID: cc9ed202-352a-415c-bab1-b197cd51bc20
Catalogue record
Date deposited: 03 May 2019 16:30
Last modified: 16 Mar 2024 07:48