The University of Southampton
University of Southampton Institutional Repository

UWStereo: a large synthetic dataset for underwater stereo matching


Lv, Qingxuan, Dong, Junyu, Li, Yuezun, Chen, Sheng, Yu, Hui, Zhang, Shu and Wang, Wenhan (2025) UWStereo: a large synthetic dataset for underwater stereo matching. IEEE Transactions on Circuits and Systems for Video Technology. (doi:10.1109/TCSVT.2025.3572044).

Record type: Article

Abstract

Despite recent advances in stereo matching, its extension to intricate underwater settings remains unexplored, primarily for two reasons: 1) reduced visibility, low contrast, and other adverse effects in underwater images; and 2) the difficulty of obtaining ground-truth data for training deep learning models, i.e., simultaneously capturing an image and estimating its corresponding pixel-wise depth in underwater environments. To enable further advances in underwater stereo matching, we introduce a large synthetic dataset called UWStereo. The dataset includes 29,568 synthetic stereo image pairs with dense and accurate disparity annotations for the left view. We design four distinct underwater scenes filled with diverse objects such as corals, ships, and robots. We also introduce additional variations in camera model, lighting, and environmental effects. Compared with existing underwater datasets, UWStereo is superior in scale, variation, annotation, and photo-realistic image quality. To substantiate the efficacy of the UWStereo dataset, we undertake a comprehensive evaluation using eleven state-of-the-art algorithms as benchmarks. The results indicate that current models still struggle to generalize to new domains. Hence, we design a new strategy that learns to reconstruct cross-domain masked images before stereo matching training, and we integrate a cross-view attention enhancement module that aggregates long-range content information to enhance generalization ability.
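
The abstract refers to both disparity annotations and pixel-wise depth. For a rectified pinhole stereo pair the two are related by depth = focal length × baseline / disparity. The following is a minimal sketch of that standard relation only; the function name, the focal length, and the baseline are hypothetical illustration values and are not taken from the UWStereo dataset or the paper.

```python
from typing import List, Optional


def disparity_to_depth(
    disparity_px: List[List[float]],
    focal_px: float,
    baseline_m: float,
) -> List[List[Optional[float]]]:
    """Convert a left-view disparity map (pixels) to depth (metres).

    Assumes a rectified pinhole stereo rig. Pixels with non-positive
    disparity have no finite depth and are returned as None.
    """
    depth: List[List[Optional[float]]] = []
    for row in disparity_px:
        depth.append(
            [focal_px * baseline_m / d if d > 0 else None for d in row]
        )
    return depth


# Hypothetical camera parameters, chosen only for illustration.
FOCAL_PX = 800.0    # focal length in pixels (assumption)
BASELINE_M = 0.125  # distance between the two cameras in metres (assumption)

# A tiny 2x2 disparity map standing in for a dense annotation.
disparity = [[50.0, 25.0], [0.0, 10.0]]
depth = disparity_to_depth(disparity, FOCAL_PX, BASELINE_M)
```

Larger disparities map to nearer points, which is why errors in small disparities (distant scene content) translate into large depth errors.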

Text
TCSVT_UWStereo - Accepted Manuscript
Available under License Creative Commons Attribution.
Download (3MB)

More information

Accepted/In Press date: 13 May 2025
e-pub ahead of print date: 21 May 2025

Identifiers

Local EPrints ID: 502208
URI: http://eprints.soton.ac.uk/id/eprint/502208
ISSN: 1558-2205
PURE UUID: 6718ba46-7310-4e6a-a410-97e0829a80ce

Catalogue record

Date deposited: 18 Jun 2025 16:37
Last modified: 18 Jun 2025 16:37

Contributors

Author: Qingxuan Lv
Author: Junyu Dong
Author: Yuezun Li
Author: Sheng Chen
Author: Hui Yu
Author: Shu Zhang
Author: Wenhan Wang
