University of Southampton Institutional Repository

Deep reinforcement learning assisted UAV path planning relying on cumulative reward mode and region segmentation

Wang, Zhipeng, Ng, Soon Xin and El-Hajjar, Mohammed (2024) Deep reinforcement learning assisted UAV path planning relying on cumulative reward mode and region segmentation. IEEE Open Journal of Vehicular Technology, 5, 737-751. (doi:10.1109/OJVT.2024.3402129).

Record type: Article

Abstract

In recent years, unmanned aerial vehicles (UAVs) have been considered for many applications, such as disaster prevention and control, logistics and transportation, and wireless communication. Most UAVs must be manually operated by remote control, which can be challenging in many environments. Autonomous UAVs have therefore attracted significant research interest, yet most existing autonomous navigation algorithms suffer from long computation times and unsatisfactory performance. Hence, we propose a Deep Reinforcement Learning (DRL) UAV path planning algorithm based on cumulative reward and region segmentation. The proposed region segmentation aims to reduce the probability of DRL agents falling into local optimal traps, while the proposed cumulative reward model takes into account both the distance from a node to the destination and the density of obstacles near that node, which addresses the sparse training data problem faced by DRL algorithms in path planning tasks. The proposed region segmentation algorithm and cumulative reward model have been tested with different DRL techniques, where we show that the cumulative reward model can improve the training efficiency of deep neural networks by 30.8%, and that the region segmentation algorithm enables the deep Q-network agent to avoid 99% of local optimal traps and assists the deep deterministic policy gradient agent to avoid 92% of local optimal traps.
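For illustration only, the minimal Python sketch below shows one plausible shape for a per-node reward that combines the two quantities the abstract describes: distance to the destination and local obstacle density. The function name, the weights w_dist and w_obs, and the density radius are hypothetical placeholders, not the authors' published formulation.

```python
import math

# Illustrative sketch only: a per-node reward combining distance-to-goal
# and local obstacle density, in the spirit of the cumulative reward
# model described in the abstract. Weights, radius, and function name
# are hypothetical assumptions, NOT taken from the paper.
def node_reward(node, destination, obstacles, radius=2.0,
                w_dist=1.0, w_obs=0.5):
    # Distance term: nodes closer to the destination are penalised less.
    dist = math.dist(node, destination)

    # Density term: fraction of known obstacles within `radius` of the node.
    near = sum(1 for obs in obstacles if math.dist(node, obs) <= radius)
    density = near / max(len(obstacles), 1)

    # A dense (non-sparse) reward signal available at every visited node,
    # unlike goal-only rewards that leave most transitions unrewarded.
    return -w_dist * dist - w_obs * density

# Example: reward for a node at (1, 1) heading to (5, 5) with two obstacles.
print(node_reward((1, 1), (5, 5), [(2, 1), (4, 4)]))
```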

Text
paper - Accepted Manuscript
Available under License Creative Commons Attribution.
Download (4MB)
Text
Deep_Reinforcement_Learning_Assisted_UAV_Path_Planning_Relying_on_Cumulative_Reward_mode_and_Region_Segmentation - Version of Record
Available under License Creative Commons Attribution.
Download (4MB)

More information

Accepted/In Press date: 14 May 2024
e-pub ahead of print date: 16 May 2024
Published date: 16 May 2024
Additional Information: Publisher Copyright: © 2024 IEEE.
Keywords: Autonomous navigation, Autonomous aerial vehicles, Autonomous robots, Cumulative reward model, Deep reinforcement learning, Experience replay, Heuristic algorithms, Navigation, Path planning, Region segmentation, Training, UAV path planning

Identifiers

Local EPrints ID: 490238
URI: http://eprints.soton.ac.uk/id/eprint/490238
ISSN: 2644-1330
PURE UUID: 7e95f57a-fb0e-4ee7-a3c4-731385ad2f9e
ORCID for Soon Xin Ng: orcid.org/0000-0002-0930-7194
ORCID for Mohammed El-Hajjar: orcid.org/0000-0002-7987-1401

Catalogue record

Date deposited: 20 May 2024 17:45
Last modified: 12 Jul 2024 01:49

Contributors

Author: Zhipeng Wang
Author: Soon Xin Ng
Author: Mohammed El-Hajjar

Download statistics

Downloads from ePrints over the past year. Other digital versions may also be available to download e.g. from the publisher's website.
