COLERGs-constrained safe reinforcement learning for realising MASS's risk-informed collision avoidance decision making

Maritime autonomous surface ships (MASS) represent a significant advance in maritime technology, offering the potential for increased efficiency, reduced operational costs, and enhanced maritime traffic safety. However, MASS navigation in complex traffic and congested waters presents challenges, especially for Collision Avoidance Decision Making (CADM) in multi-ship encounter scenarios. Building on a robust risk assessment design for encounter scenarios involving time-sequential and joint target ships (TSs), this paper proposes a novel risk and reliability critic-enhanced safe hierarchical reinforcement learning method (RA-SHRL), constrained by the International Regulations for Preventing Collisions at Sea (COLREGs), to realize autonomous navigation and CADM for MASS. Experimental simulations are conducted on a time-sequenced obstacle avoidance scenario and a swarm obstacle avoidance scenario. The results demonstrate that RA-SHRL generates safe, efficient, and reliable collision avoidance strategies in both time-sequential dynamic obstacle environments and mixed joint-TSs environments. RA-SHRL is also capable of assessing risk and avoiding multiple joint-TSs. Compared with Deep Q-network (DQN) and Constrained Policy Optimization (CPO), the search efficiency of the proposed algorithm is improved by 40% and 12%, respectively. Moreover, it achieved a 91.3% collision avoidance success rate during training. The methodology could also benefit other autonomous systems in dynamic environments.

10.1016/j.knosys.2024.112205

0950-7051

Wang, Chengbo

08e72cc5-67a7-448b-ba20-46ae3ed588da

Zhang, Xinyu

3bf3c7d5-4670-4162-a9a8-3eebe4bb6c40

Gao, Hongbo

9af7d842-ea39-4d80-8051-3bddf4131647

Bashir, Musa

03146b14-6871-4656-8318-d25655175374

Li, Huanhuan

5e806b21-10a7-465c-9db3-32e466ae42f1

Yang, Zaili

82d4eebc-4532-4343-8555-35169e79bb6d

24 July 2024

Wang, Chengbo, Zhang, Xinyu, Gao, Hongbo, Bashir, Musa, Li, Huanhuan and Yang, Zaili (2024) COLERGs-constrained safe reinforcement learning for realising MASS's risk-informed collision avoidance decision making. Knowledge-Based Systems, 300, [112205]. (doi:10.1016/j.knosys.2024.112205).

Record type: Article

Text

KNOSYS-Accepted manuscript - Accepted Manuscript

Restricted to Repository staff only until 31 August 2026.

More information

Accepted/In Press date: 6 July 2024

e-pub ahead of print date: 16 July 2024

Published date: 24 July 2024

Identifiers

Local EPrints ID: 503689

URI: http://eprints.soton.ac.uk/id/eprint/503689

DOI: 10.1016/j.knosys.2024.112205

ISSN: 0950-7051

PURE UUID: 3c156f0b-bbd1-4c08-97a9-3a718719692d

ORCID for Huanhuan Li: orcid.org/0000-0002-4293-4763

Catalogue record

Date deposited: 11 Aug 2025 16:31

Last modified: 22 Aug 2025 02:49

Contributors

Author: Chengbo Wang

Author: Xinyu Zhang

Author: Hongbo Gao

Author: Musa Bashir

Author: Huanhuan Li

Author: Zaili Yang
