University of Southampton Institutional Repository

Deep intrinsically motivated continuous actor-critic for efficient robotic visuomotor skill learning


Hafez, Muhammad Burhan, Weber, Cornelius, Kerzel, Matthias and Wermter, Stefan (2019) Deep intrinsically motivated continuous actor-critic for efficient robotic visuomotor skill learning. Paladyn, Journal of Behavioral Robotics, 10 (1), 14-29. (doi:10.1515/pjbr-2019-0005).

Record type: Article

Abstract

In this paper, we present a new intrinsically motivated actor-critic algorithm for learning continuous motor skills directly from raw visual input. Our neural architecture is composed of a critic and an actor network. Both networks receive the hidden representation of a deep convolutional autoencoder which is trained to reconstruct the visual input, while the centre-most hidden representation is also optimized to estimate the state value. Separately, an ensemble of predictive world models generates, based on its learning progress, an intrinsic reward signal which is combined with the extrinsic reward to guide the exploration of the actor-critic learner. Our approach is more data-efficient and inherently more stable than the existing actor-critic methods for continuous control from pixel data. We evaluate our algorithm for the task of learning robotic reaching and grasping skills on a realistic physics simulator and on a humanoid robot. The results show that the control policies learned with our approach can achieve better performance than the compared state-of-the-art and baseline algorithms in both dense-reward and challenging sparse-reward settings.
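
The abstract does not spell out how learning progress becomes a reward, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that progress is the drop in mean prediction error between the older and newer halves of a sliding window, and that the intrinsic and extrinsic rewards are mixed by a weighted sum. The names `LearningProgressReward`, `combined_reward`, `window` and `beta` are hypothetical, and a single predictive model is tracked for brevity where the paper uses an ensemble.

```python
from collections import deque


class LearningProgressReward:
    """Intrinsic reward from the learning progress of one predictive model.

    Assumption: progress is the drop in mean prediction error between the
    older and the newer half of a sliding window. The paper may define it
    differently.
    """

    def __init__(self, window=10):
        if window < 2 or window % 2:
            raise ValueError("window must be an even number >= 2")
        self.errors = deque(maxlen=window)

    def update(self, prediction_error):
        """Record a new prediction error and return the intrinsic reward."""
        self.errors.append(float(prediction_error))
        if len(self.errors) < self.errors.maxlen:
            return 0.0  # not enough history to estimate progress yet
        half = self.errors.maxlen // 2
        history = list(self.errors)
        older = sum(history[:half]) / half
        newer = sum(history[half:]) / half
        # Reward only improvement; a model that gets worse earns no bonus.
        return max(0.0, older - newer)


def combined_reward(extrinsic, intrinsic, beta=0.5):
    """Weighted sum of task reward and intrinsic bonus (beta is hypothetical)."""
    return extrinsic + beta * intrinsic


# Usage: prediction errors that shrink over time yield a positive bonus.
tracker = LearningProgressReward(window=4)
for err in [1.0, 0.8, 0.4, 0.2]:
    r_int = tracker.update(err)
print(combined_reward(0.0, r_int))
```

In a sparse-reward setting the extrinsic term is zero on most steps, so under these assumptions the learner is driven mainly by the bonus until the task reward appears.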

Text
10.1515_pjbr-2019-0005 - Version of Record

More information

Published date: 1 January 2019

Identifiers

Local EPrints ID: 495810
URI: http://eprints.soton.ac.uk/id/eprint/495810
ISSN: 2081-4836
PURE UUID: b41f11a7-417f-435d-b1d9-b4d4b3a2bc0d
ORCID for Muhammad Burhan Hafez: orcid.org/0000-0003-1670-8962

Catalogue record

Date deposited: 22 Nov 2024 18:08
Last modified: 23 Nov 2024 03:11

Contributors

Author: Muhammad Burhan Hafez
Author: Cornelius Weber
Author: Matthias Kerzel
Author: Stefan Wermter
