Learning behaviour-performance maps with meta-evolution
The MAP-Elites quality-diversity algorithm has been successful in robotics because it can create a behaviourally diverse set of solutions that can later be used for adaptation, for instance to unanticipated damage. In MAP-Elites, the choice of the behaviour space is essential for adaptation (the recovery of performance in unseen environments), since it defines the diversity of the solutions. Current practice is to hand-code a set of behavioural features; however, given the large space of possible behaviour-performance maps, the designer does not know a priori which behavioural features maximise a map's adaptation potential. We introduce a new meta-evolution algorithm that discovers the behavioural features that maximise future adaptations. The proposed method applies the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) to evolve a population of behaviour-performance maps to maximise a meta-fitness function that rewards adaptation. The method stores solutions found by MAP-Elites in a database, which allows new behaviour-performance maps to be constructed rapidly on the fly. To evaluate this system, we study the gait of the RHex robot as it adapts to a range of damage sustained by its legs. In a comparison with MAP-Elites using user-defined behaviour spaces, we demonstrate that the meta-evolution system learns high-performing gaits with or without damage injected into the robot.
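The database idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the record layout, the names `descriptor` and `build_map`, the normalised linear projection, and the grid resolution are all assumptions made for illustration. The point it shows is that once evaluated solutions are stored with a set of base features, a new behaviour-performance map can be rebuilt for any choice of behavioural features without re-evaluating a single solution.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical record layout: (base_features, fitness), with every base
# feature assumed to lie in [0, 1]. This is an illustrative assumption,
# not the data format used in the paper.
Solution = Tuple[Sequence[float], float]
Cell = Tuple[int, ...]


def descriptor(features: Sequence[float],
               weights: Sequence[Sequence[float]]) -> List[float]:
    """Project base features onto a behaviour descriptor in [0, 1]^d.

    Each row of `weights` (assumed non-negative) defines one behavioural
    feature as a weighted average of the base features.
    """
    out = []
    for row in weights:
        total = sum(row)
        value = sum(w * f for w, f in zip(row, features)) / total if total > 0 else 0.0
        out.append(min(max(value, 0.0), 1.0))  # clamp against rounding error
    return out


def build_map(database: Sequence[Solution],
              weights: Sequence[Sequence[float]],
              bins: int = 4) -> Dict[Cell, Solution]:
    """Rebuild a behaviour-performance map from stored solutions.

    Every solution is assigned to a grid cell by its descriptor, and each
    cell keeps only the fittest solution, as in a MAP-Elites archive.
    """
    grid: Dict[Cell, Solution] = {}
    for features, fitness in database:
        cell = tuple(min(int(v * bins), bins - 1)
                     for v in descriptor(features, weights))
        if cell not in grid or fitness > grid[cell][1]:
            grid[cell] = (features, fitness)
    return grid


# Toy database of three already-evaluated solutions (made-up numbers).
database = [([0.1, 0.9, 0.5], 1.0),
            ([0.1, 0.9, 0.4], 2.0),
            ([0.8, 0.2, 0.5], 0.5)]

# Two candidate behaviour spaces built from the same database.
map_a = build_map(database, [[1, 0, 0], [0, 1, 0]], bins=2)
map_b = build_map(database, [[0, 0, 1]], bins=2)
```

In this sketch, a meta-level optimiser such as CMA-ES would search over the entries of `weights` and score each resulting map with a meta-fitness; that outer loop and the meta-fitness itself are omitted here because the abstract does not specify them.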
Behavioural diversity, Damage recovery, Evolutionary robotics, Meta-learning, Quality-diversity algorithms
49-57
Bossens, David
633a4d28-2e59-4343-98fe-283082ba1873
Mouret, Jean-Baptiste
a837dbc0-1852-4e6f-93d8-41d927305eaf
Tarapore, Danesh
fe8ec8ae-1fad-4726-abef-84b538542ee4
25 June 2020
Bossens, David, Mouret, Jean-Baptiste and Tarapore, Danesh (2020) Learning behaviour-performance maps with meta-evolution. In GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference. (doi:10.1145/3377930.3390181).
Record type: Conference or Workshop Item (Paper)
This record has no associated files available for download.
More information
Published date: 25 June 2020
Additional Information:
Funding Information:
This work has been supported by the Engineering and Physical Sciences Research Council (EPSRC) under the New Investigator Award grant (EP/R030073/1), the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (GA no. 637972, project “ResiBots”) and the Lifelong Learning Machines program (L2M) from DARPA/MTO under Contract No. FA8750-18-C-0103.
Publisher Copyright:
© 2020 ACM.
Identifiers
Local EPrints ID: 441277
URI: http://eprints.soton.ac.uk/id/eprint/441277
PURE UUID: af83dadc-2bd1-4cc9-8f89-62282b75f6c7
Catalogue record
Date deposited: 08 Jun 2020 16:31
Last modified: 17 Mar 2024 03:46
Contributors
Author: David Bossens
Author: Jean-Baptiste Mouret
Author: Danesh Tarapore