University of Southampton Institutional Repository

# Learning Complex Policy Distribution with CEM Guided Adversarial Hypernetwork

Oliehoek, Frans, Tang, Shi Yuan and Zhang, Jie (2021) Learning Complex Policy Distribution with CEM Guided Adversarial Hypernetwork. Tenth International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2011), Taipei. 01 - 05 May 2011. pp. 1296-1304.

Record type: Conference or Workshop Item (Paper)

## Abstract

The Cross-Entropy Method (CEM) is a gradient-free direct policy search method that offers greater stability and is insensitive to hyper-parameter tuning. CEM resembles population-based evolutionary methods but, rather than maintaining a population, it uses a distribution over candidate solutions (policies in our case). Usually, a natural exponential family distribution such as a multivariate Gaussian is used to parameterize the policy distribution. Using a multivariate Gaussian limits the quality of CEM policies because the search becomes confined to a less representative subspace. We address this drawback by using an adversarially trained hypernetwork, enabling a richer and more complex representation of the policy distribution. To achieve better training stability and faster convergence, we use a multivariate Gaussian CEM policy to guide our adversarial training process. Experiments demonstrate that our approach outperforms state-of-the-art CEM-based methods by $15.8\%$ in terms of rewards while achieving faster convergence. Results also show that our approach is less sensitive to hyper-parameters than other deep-RL methods such as REINFORCE, DDPG and DQN.
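
For readers unfamiliar with the Gaussian CEM baseline that the abstract refers to, the following is a minimal illustrative sketch, not the paper's implementation. It assumes a diagonal Gaussian (a simplification of the multivariate Gaussian) over a parameter vector, and a toy quadratic objective standing in for episode reward. The names `gaussian_cem` and `toy_reward`, and all hyper-parameter values, are hypothetical. The adversarial hypernetwork itself is not shown.

```python
import numpy as np

def gaussian_cem(score_fn, dim, iterations=50, pop_size=100,
                 elite_frac=0.2, init_std=2.0, seed=0):
    """Basic CEM with a diagonal Gaussian over candidate parameter vectors.

    Assumption: score_fn maps a parameter vector to a scalar to maximize
    (in policy search this would be the return of a rollout).
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)
    std = np.full(dim, init_std)
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(iterations):
        # Sample candidates from the current search distribution.
        samples = mean + std * rng.standard_normal((pop_size, dim))
        scores = np.array([score_fn(s) for s in samples])
        # Keep the highest-scoring candidates (the elite set).
        elite = samples[np.argsort(scores)[-n_elite:]]
        # Refit the Gaussian to the elites; the small constant keeps
        # the standard deviation from collapsing to exactly zero.
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-3
    return mean

# Toy stand-in for a reward signal: peak at a known target vector.
target = np.array([1.0, -2.0, 0.5])

def toy_reward(theta):
    return -np.sum((theta - target) ** 2)

best = gaussian_cem(toy_reward, dim=3)
```

In this sketch the search distribution is refit to the elite samples at every iteration, which is the step where a single Gaussian restricts the shapes of policy distribution that can be represented.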

Full text not available from this repository.

Published date: 4 May 2021
Venue - Dates: Tenth International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2011), Taipei, 2011-05-01 - 2011-05-05
Keywords: Cross-Entropy Method, Generative Adversarial Networks, Hypernetworks, Reinforcement Learning

## Identifiers

Local EPrints ID: 451466
URI: http://eprints.soton.ac.uk/id/eprint/451466
PURE UUID: 9cd5a9d1-5be1-4d98-8a39-c19962f50bd2
ORCID for Jie Zhang: orcid.org/0000-0002-5348-7671

## Catalogue record

Date deposited: 29 Sep 2021 19:06

## Contributors

Author: Frans Oliehoek
Author: Shi Yuan Tang
Author: Jie Zhang