Proactive multi-USV maritime search and rescue in stochastic wave environments: a hierarchical, non-causal reinforcement learning framework
Song, Yutong
Zeng, Tianyi
Zhang, Yao
Tezdogan, Tahsin
Song, Yutong, Zeng, Tianyi, Zhang, Yao and Tezdogan, Tahsin (2026) Proactive multi-USV maritime search and rescue in stochastic wave environments: a hierarchical, non-causal reinforcement learning framework. Ocean Engineering. (In Press)
Abstract
Maritime search and rescue (SAR) in stochastic wave environments presents a critical challenge for multi-Unmanned Surface Vehicle (USV) systems and demands a fine balance between search efficiency and operational safety. This paper proposes a novel hierarchical reinforcement learning framework, termed Non-Causal Reward Multi-Agent Proximal Policy Optimization (NCR-MAPPO), to address this challenge. Our framework decouples the mission by employing a strategic guidance system based on International Maritime Organization (IMO) standards for systematic coverage while a tactical motion controller built upon the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm learns cooperative execution. The core innovation is a Non-Causal Reward (NCR) mechanism that incorporates short-term wave field prediction into the decision process, enabling a shift from reactive collision avoidance to proactive seakeeping control. Through comprehensive simulations, we demonstrate the superiority of our framework. Compared to the standard MAPPO baseline, NCR-MAPPO significantly enhances survivability by reducing wave impact incidents by 27% and exposure to hazardous sea states by 25% while maintaining high mission efficiency. This work provides a robust solution for autonomous marine systems by bridging the gap between regulatory compliance and predictive safety control.
Text
OE_MARL(unmarked) - Accepted Manuscript
Restricted to Repository staff only until 31 March 2027.
More information
Accepted/In Press date: 31 March 2026
Identifiers
Local EPrints ID: 511465
URI: http://eprints.soton.ac.uk/id/eprint/511465
ISSN: 0029-8018
PURE UUID: 6ab809da-6446-4a9c-8ceb-ce0b9336a146
Catalogue record
Date deposited: 15 May 2026 16:45
Last modified: 16 May 2026 02:09
Contributors
Author: Yutong Song
Author: Tianyi Zeng
Author: Yao Zhang
Author: Tahsin Tezdogan