The University of Southampton
University of Southampton Institutional Repository

Proactive multi-USV maritime search and rescue in stochastic wave environments: a hierarchical, non-causal reinforcement learning framework


Song, Yutong, Zeng, Tianyi, Zhang, Yao and Tezdogan, Tahsin (2026) Proactive multi-USV maritime search and rescue in stochastic wave environments: a hierarchical, non-causal reinforcement learning framework. Ocean Engineering. (In Press)

Record type: Article

Abstract

Maritime search and rescue (SAR) in stochastic wave environments presents a critical challenge for multi-Unmanned Surface Vehicle (USV) systems and demands a fine balance between search efficiency and operational safety. This paper proposes a novel hierarchical reinforcement learning framework, termed Non-Causal Reward Multi-Agent Proximal Policy Optimization (NCR-MAPPO), to address this challenge. Our framework decouples the mission by employing a strategic guidance system based on International Maritime Organization (IMO) standards for systematic coverage while a tactical motion controller built upon the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm learns cooperative execution. The core innovation is a Non-Causal Reward (NCR) mechanism that incorporates short-term wave field prediction into the decision process, enabling a shift from reactive collision avoidance to proactive seakeeping control. Through comprehensive simulations, we demonstrate the superiority of our framework. Compared to the standard MAPPO baseline, NCR-MAPPO significantly enhances survivability by reducing wave impact incidents by 27% and exposure to hazardous sea states by 25% while maintaining high mission efficiency. This work provides a robust solution for autonomous marine systems by bridging the gap between regulatory compliance and predictive safety control.
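
To make the idea of a reward that "incorporates short-term wave field prediction" concrete, the following is a minimal illustrative sketch only. It is not taken from the paper: the function name, the excess-height penalty, the discounting over the prediction horizon, and all default values are assumptions chosen for illustration. It shows one way a present-time reward could be reduced by a penalty on the wave heights predicted along an agent's near-future path.

```python
from typing import Sequence


def non_causal_reward(
    causal_reward: float,
    predicted_wave_heights: Sequence[float],
    hazard_height: float = 2.0,   # assumed hazard threshold in metres
    penalty_weight: float = 0.5,  # assumed weight of the predictive term
    discount: float = 0.9,        # assumed down-weighting of later predictions
) -> float:
    """Combine a present-time reward with a penalty on predicted wave exposure.

    Hypothetical helper: every name and default here is an assumption,
    not the reward definition used in the paper.
    """
    penalty = 0.0
    for step, height in enumerate(predicted_wave_heights):
        # Only the part of the predicted height above the threshold is penalised.
        excess = max(0.0, height - hazard_height)
        # Predictions further into the horizon count less.
        penalty += (discount ** step) * excess
    return causal_reward - penalty_weight * penalty


# Calm forecast: the reward is unchanged.
calm = non_causal_reward(1.0, [1.0, 1.5])

# Rough forecast: the reward drops before the waves actually arrive.
rough = non_causal_reward(1.0, [3.0, 4.0], discount=0.5)
```

Under these assumptions, a policy trained on such a signal is discouraged from heading into water that is forecast to become hazardous, rather than only reacting once an impact has occurred.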

Text
OE_MARL(unmarked) - Accepted Manuscript
Restricted to Repository staff only until 31 March 2027.

More information

Accepted/In Press date: 31 March 2026

Identifiers

Local EPrints ID: 511465
URI: http://eprints.soton.ac.uk/id/eprint/511465
ISSN: 0029-8018
PURE UUID: 6ab809da-6446-4a9c-8ceb-ce0b9336a146
ORCID for Tahsin Tezdogan: orcid.org/0000-0002-7032-3038

Catalogue record

Date deposited: 15 May 2026 16:45
Last modified: 16 May 2026 02:09

Contributors

Author: Yutong Song
Author: Tianyi Zeng
Author: Yao Zhang
Author: Tahsin Tezdogan
