The University of Southampton
University of Southampton Institutional Repository

Advanced autonomous collision avoidance for maritime navigation: A reinforcement learning approach with ship dynamics and environmental awareness

Yang, Lichao, Liu, Jingxian, Zhou, Qin, Liu, Zhao, Wang, Yukuan, Liu, Yang, Li, Xuejiao and Li, Huanhuan (2026) Advanced autonomous collision avoidance for maritime navigation: A reinforcement learning approach with ship dynamics and environmental awareness. Transportation Research Part E: Logistics and Transportation Review, 212, [104901]. (doi:10.1016/j.tre.2026.104901).

Record type: Article

Abstract

Autonomous collision avoidance is critical for ensuring the safety and efficiency of maritime navigation. However, existing approaches often struggle to achieve realistic manoeuvrability, robust generalisation, and compliance with the Convention on the International Regulations for Preventing Collisions at Sea (COLREGs). To address these challenges, this study proposes a Reinforcement Learning (RL)-based collision avoidance framework, integrating three key innovations. Firstly, a discrete action space is designed to accurately capture the rudder control characteristics commonly used in real maritime operations. This is integrated with a Manoeuvring Modelling Group (MMG) model, ensuring that the generated trajectories are dynamically feasible and operationally realistic. Secondly, a multi-dimensional reward function is developed, incorporating collision risk, distance to target, navigational efficiency, operational comfort, and compliance with COLREGs. This is further supported by a line-of-sight (LOS) tracking mechanism, which stabilises heading corrections based on dynamic path requirements, significantly improving the agent’s course-keeping ability. Finally, the framework includes a robust generalisation strategy, using polygonal obstacle modelling to represent complex, irregular hazards more accurately. This is combined with real-world bathymetric data and multi-ship encounters for rigorous validation, ensuring the system can operate effectively in uncertain, multi-agent, and non-cooperative environments. The proposed model is trained using the Phasic Policy Gradient (PPG) algorithm within an Actor-Critic (AC) architecture, enabling robust policy learning under uncertainty. Simulation results demonstrate that the framework effectively reduces collision risk, maintains stable trajectories, and adheres to COLREGs, making it a practical and scalable solution for next-generation autonomous ship navigation.
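The line-of-sight (LOS) tracking mechanism named in the abstract is not specified in this record. As a rough illustration only, the sketch below uses the common lookahead-based LOS guidance law for a straight path segment; it is a swapped-in textbook form, not the authors' formulation. The function names, the lookahead distance, and the frame convention (angles measured from the x axis toward the y axis) are all assumptions.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def los_heading(pos, wp_prev, wp_next, lookahead):
    """Desired heading (rad) from lookahead-based LOS guidance.

    pos, wp_prev, wp_next are (x, y) tuples; lookahead is a positive
    distance in the same units. All names here are hypothetical.
    """
    # Direction of the path segment from the previous to the next waypoint.
    path_angle = math.atan2(wp_next[1] - wp_prev[1], wp_next[0] - wp_prev[0])
    dx = pos[0] - wp_prev[0]
    dy = pos[1] - wp_prev[1]
    # Signed cross-track error: offset perpendicular to the path.
    cross_track = -dx * math.sin(path_angle) + dy * math.cos(path_angle)
    # Steer toward a point `lookahead` ahead along the path.
    return wrap_angle(path_angle + math.atan2(-cross_track, lookahead))


def heading_error(desired, actual):
    """Smallest signed difference between desired and actual heading (rad)."""
    return wrap_angle(desired - actual)
```

A heading error of this kind could serve as an observation or a penalty term for a learning agent, though whether the paper uses it that way is not stated here. A larger lookahead distance gives gentler convergence to the path; a smaller one gives sharper corrections.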

Text
ESREL-SRA-E2025-P0401 - Version of Record
Available under License Creative Commons Attribution.
Download (187kB)

More information

Accepted/In Press date: 20 April 2026
e-pub ahead of print date: 28 April 2026
Published date: 1 August 2026
Additional Information: Publisher Copyright: © 2026 The Author(s).
Keywords: Autonomous ship navigation, Autonomous ship, Collision risk, Reinforcement learning, Ship collision avoidance

Identifiers

Local EPrints ID: 511423
URI: http://eprints.soton.ac.uk/id/eprint/511423
ISSN: 1366-5545
PURE UUID: 13cf1637-9383-4c18-b969-c42088f3f2c3
ORCID for Qin Zhou: orcid.org/0000-0002-0273-6295
ORCID for Huanhuan Li: orcid.org/0000-0002-4293-4763

Catalogue record

Date deposited: 14 May 2026 16:35
Last modified: 15 May 2026 02:13

Contributors

Author: Lichao Yang
Author: Jingxian Liu
Author: Qin Zhou
Author: Zhao Liu
Author: Yukuan Wang
Author: Yang Liu
Author: Xuejiao Li
Author: Huanhuan Li
