The University of Southampton
University of Southampton Institutional Repository

Unmanned aerial vehicle pitch control under delay using deep reinforcement learning with continuous action in wind tunnel test


Wada, Daichi, Araujo-Estrada, Sergio A. and Windsor, Shane (2021) Unmanned aerial vehicle pitch control under delay using deep reinforcement learning with continuous action in wind tunnel test. Aerospace, 8 (9). (doi:10.3390/aerospace8090258).

Record type: Article

Abstract

Nonlinear flight controllers for fixed-wing unmanned aerial vehicles (UAVs) can potentially be developed using deep reinforcement learning. However, there is often a reality gap between the simulation models used to train these controllers and the real world. This study experimentally investigated the application of deep reinforcement learning to the pitch control of a UAV in wind tunnel tests, with a particular focus on the effect of time delays on flight controller performance. Multiple neural networks were trained in simulation with different assumed time delays and then tested in the wind tunnel. The neural networks trained with shorter delays tended to be susceptible to delay in the real tests and produced fluctuating behaviour. The neural networks trained with longer delays behaved more conservatively and did not produce oscillations, but suffered steady-state errors under some conditions due to unmodelled frictional effects. These results highlight the importance of performing physical experiments to validate controller performance, and show that the training approach used with reinforcement learning needs to be robust to reality gaps between simulation and the real world.
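
The abstract does not say how the assumed time delay was represented in the training simulation. As an illustration only, one common way to add a fixed delay to a discrete-time simulation is a first-in, first-out buffer that releases each control command a set number of steps after it was issued. The sketch below assumes this approach; the class name `DelayedActuator` and its parameters are hypothetical and are not taken from the paper.

```python
from collections import deque


class DelayedActuator:
    """Hypothetical fixed-delay line for control commands.

    Each command passed to step() is returned delay_steps calls later.
    Until the first real command emerges, initial_command is returned.
    """

    def __init__(self, delay_steps, initial_command=0.0):
        if delay_steps < 0:
            raise ValueError("delay_steps must be non-negative")
        # Pre-fill the buffer so the first delay_steps outputs are neutral.
        self._queue = deque([initial_command] * delay_steps)

    def step(self, command):
        # Enqueue the newest command, then release the oldest one.
        self._queue.append(command)
        return self._queue.popleft()


# Usage: with a two-step delay, the command issued at step k
# reaches the simulated control surface at step k + 2.
actuator = DelayedActuator(delay_steps=2)
applied = [actuator.step(c) for c in [1.0, 2.0, 3.0, 4.0]]
print(applied)
```

In a training loop, the policy output would be passed through such a buffer before being applied to the simulated aircraft, and the number of delay steps would be varied between training runs to reflect different assumed delays.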

Text
aerospace-08-00258-v2 - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 10 September 2021
Published date: 11 September 2021

Identifiers

Local EPrints ID: 469032
URI: http://eprints.soton.ac.uk/id/eprint/469032
ISSN: 2226-4310
PURE UUID: 7364e536-6f47-4477-942e-e487e99fd437
ORCID for Sergio A. Araujo-Estrada: orcid.org/0000-0002-5432-5842

Catalogue record

Date deposited: 05 Sep 2022 16:55
Last modified: 17 Mar 2024 04:12

Contributors

Author: Daichi Wada
Author: Sergio A. Araujo-Estrada
Author: Shane Windsor
