University of Southampton Institutional Repository

GeoCLR: georeference contrastive learning for efficient seafloor image interpretation


Yamada, Takaki, Prugel-Bennett, Adam, Pizarro, Oscar, Williams, Stefan B. and Thornton, Blair (2022) GeoCLR: georeference contrastive learning for efficient seafloor image interpretation. Field Robotics, 2, 1134-1155. (doi:10.55417/fr.2022037).

Record type: Article

Abstract

This paper describes Georeference Contrastive Learning of visual Representation (GeoCLR) for efficient training of deep-learning Convolutional Neural Networks (CNNs). The method leverages georeference information by generating similar image pairs from images taken at nearby locations and contrasting these with image pairs taken far apart. The underlying assumption is that images gathered within a short distance of each other are more likely to have a similar visual appearance. This assumption is reasonable in seafloor robotic imaging applications, where image footprints are limited to edge lengths of a few metres and images are taken so that they overlap along a vehicle's trajectory, whereas seafloor substrates and habitats have far larger patch sizes. A key advantage of the method is that it is self-supervised and requires no human input for CNN training. It is also computationally efficient: results can be generated between dives during multi-day AUV missions using computational resources that would be accessible during most oceanic field trials. We apply GeoCLR to habitat classification on a dataset of ~86k images gathered using an Autonomous Underwater Vehicle (AUV), and demonstrate how the latent representations generated by GeoCLR can be used to efficiently guide human annotation effort. The resulting semi-supervised framework improves classification accuracy by an average of 10.2% compared with the state-of-the-art SimCLR, using the same CNN and an equivalent number of human annotations for training.
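
As a minimal sketch of the georeference-based pairing idea described in the abstract, the Python snippet below selects a positive partner for an anchor image from images captured within a small distance of it, falling back to the anchor itself when no neighbour is close enough. The function name, the dictionary keys, and the 2 m threshold are illustrative assumptions, not details taken from the paper's implementation.

import math
import random

def sample_geo_pair(images, index, max_dist_m=2.0):
    """Pick a positive partner for images[index] from images taken within
    max_dist_m metres of it.

    Each element of `images` is assumed to be a dict with keys
    'path', 'easting' and 'northing' (metres in a local map frame).
    """
    anchor = images[index]
    neighbours = []
    for j, other in enumerate(images):
        if j == index:
            continue
        d = math.hypot(other['easting'] - anchor['easting'],
                       other['northing'] - anchor['northing'])
        if d <= max_dist_m:
            neighbours.append(other)
    # If no georeferenced neighbour is close enough, fall back to pairing
    # the anchor with itself (a standard SimCLR-style augmented pair).
    positive = random.choice(neighbours) if neighbours else anchor
    return anchor, positive

In a full training loop, both images of each pair would be randomly augmented, encoded by the CNN, and trained with a SimCLR-style contrastive loss in which pairs drawn from distant locations act as negatives.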

Text: Yamada_2022_FR - Accepted Manuscript (6MB)
Text: Yamada_2022_FR - Version of Record (6MB), available under a Creative Commons Attribution License

More information

Accepted/In Press date: 20 April 2022
e-pub ahead of print date: 9 June 2022

Identifiers

Local EPrints ID: 456914
URI: http://eprints.soton.ac.uk/id/eprint/456914
PURE UUID: 394bbe96-f786-44fb-a66e-f7002dd60b2e
ORCID for Takaki Yamada: orcid.org/0000-0002-5090-7239

Catalogue record

Date deposited: 17 May 2022 16:37
Last modified: 17 Mar 2024 07:16


Contributors

Author: Takaki Yamada
Author: Adam Prugel-Bennett
Author: Oscar Pizarro
Author: Stefan B. Williams
Author: Blair Thornton


