University of Southampton Institutional Repository

MHNet: a hybrid network for high-resolution remote sensing image semantic segmentation based on multiscale feature fusion


Zeng, Qiaolin, Chen, Shitong, Fan, Meng, Chen, Liangfu, Zhu, Songyan and Zhou, Jingxiang (2026) MHNet: a hybrid network for high-resolution remote sensing image semantic segmentation based on multiscale feature fusion. Digital Signal Processing, 175, [106014]. (doi:10.1016/j.dsp.2026.106014).

Record type: Article

Abstract

Semantic segmentation of high-resolution remote sensing images (HRSIs) presents significant challenges, such as discrete object distributions, diverse scales, and class imbalance, which lead to blurred segmentation boundaries and weak global semantic associations. Although traditional convolutional neural networks excel at local feature extraction, their inherent structure limits the modeling of long-range dependencies. Transformers can model global context, but the quadratic complexity of the self-attention mechanism incurs high computational costs on HRSIs. This manuscript therefore proposes a novel encoder-decoder network, the Multiscale Hybrid Network (MHNet), which improves HRSI segmentation through multiscale feature fusion, global context modeling, and boundary detail optimization. In the encoder, the Neighborhood Feature Fusion (NFF) module fuses neighboring layer features, aggregating low-level details and high-level semantics via channel and spatial attention. In the decoder, the Multiscale Refinement Enhanced Transformer Block (MRETB) and the Multiscale Refinement Attention Fusion (MRAF) module are proposed. MRETB uses the Multiscale Refinement Enhancement (MSRE) module to extract multiscale features and enhance boundary information, and the Window-based Efficient Multi-Head Self-Attention (W-EMSA) mechanism to model long-range dependencies. MRAF then fuses the multiscale global context produced by MRETB, integrating multilayer features and refining boundary details. The performance of MHNet is verified by experiments on three public remote sensing image datasets.
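
To make the fusion idea concrete, below is a minimal PyTorch sketch of neighboring-layer feature fusion with channel and spatial attention in the spirit of the NFF module described above. The class names, channel-reduction ratio, bilinear upsampling, and CBAM-style attention design are illustrative assumptions; the paper's actual NFF implementation may differ.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ChannelAttention(nn.Module):
        # Squeeze spatial dims, re-weight channels (SE/CBAM-style); assumed design.
        def __init__(self, channels, reduction=16):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
            )

        def forward(self, x):
            avg = self.mlp(x.mean(dim=(2, 3)))
            mx = self.mlp(x.amax(dim=(2, 3)))
            w = torch.sigmoid(avg + mx).unsqueeze(-1).unsqueeze(-1)
            return x * w

    class SpatialAttention(nn.Module):
        # Pool over channels, predict a per-pixel gate; assumed design.
        def __init__(self, kernel_size=7):
            super().__init__()
            self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

        def forward(self, x):
            avg = x.mean(dim=1, keepdim=True)
            mx = x.amax(dim=1, keepdim=True)
            w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
            return x * w

    class NeighborhoodFeatureFusionSketch(nn.Module):
        # Fuses a low-level (high-resolution) and a high-level (low-resolution)
        # encoder feature map, then refines the result with channel and spatial
        # attention. Hypothetical stand-in for the NFF module, not the paper's code.
        def __init__(self, low_channels, high_channels, out_channels):
            super().__init__()
            self.proj = nn.Conv2d(low_channels + high_channels, out_channels, 1)
            self.ca = ChannelAttention(out_channels)
            self.sa = SpatialAttention()

        def forward(self, low, high):
            # Upsample high-level semantics to the low-level resolution, concatenate,
            # then re-weight channels and spatial positions.
            high = F.interpolate(high, size=low.shape[-2:], mode="bilinear",
                                 align_corners=False)
            x = self.proj(torch.cat([low, high], dim=1))
            return self.sa(self.ca(x))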

Text
manuscript 2 - Accepted Manuscript
Restricted to Repository staff only until 21 February 2027.

More information

e-pub ahead of print date: 21 February 2026
Published date: 27 February 2026
Keywords: Boundary refined, Multiscale, Remote sensing, Semantic segmentation, Transformer

Identifiers

Local EPrints ID: 511319
URI: http://eprints.soton.ac.uk/id/eprint/511319
ISSN: 1051-2004
PURE UUID: 00098d2f-dcd3-442f-b803-5c1b91e57a1c
ORCID for Songyan Zhu: orcid.org/0000-0001-6899-9920

Catalogue record

Date deposited: 12 May 2026 16:31
Last modified: 13 May 2026 02:12

Contributors

Author: Qiaolin Zeng
Author: Shitong Chen
Author: Meng Fan
Author: Liangfu Chen
Author: Songyan Zhu
Author: Jingxiang Zhou



