Leveraging domain knowledge in machine learning for seafloor image interpretation
This thesis develops a method to incorporate domain knowledge into modern machine learning techniques when interpreting large volumes of robotically obtained seafloor imagery. Deep learning has the potential to automate tasks such as habitat and animal recognition in marine monitoring. However, the large amount of human effort needed to train the models is a bottleneck, which motivates research into methods that reduce the human input required. This research investigates how metadata gathered during robotic imaging surveys, such as location and depth information, can be used to constrain learning based on expected metadata patterns. Two self-supervised representation learning methods are developed. The first uses deep convolutional autoencoders that leverage location and depth information to impose soft constraints, based on the assumption that images taken at physically nearby locations or similar depths are more likely to share important features than images taken far apart or at different depths. The second method uses contrastive learning techniques in which three-dimensional position information acts as a hard constraint on representation learning. Self-supervision allows both methods to be implemented on a per-dataset basis with no human input. The learned representations can be used for different downstream interpretation tasks: applications to unsupervised clustering and representative image identification (i.e. tasks that do not require any human input) are demonstrated alongside content-based retrieval and classification based on semi-supervised learning (i.e. tasks that require a relatively small amount of human input). Three real-world seafloor image datasets are analysed. These consist of ~150k seafloor images taken over 16 dives by two different Autonomous Underwater Vehicles (AUVs) along sparse and dense survey trajectories spanning a seafloor depth range of 20 to 780 metres.
The results show relative accuracy gains of 7 to 15% compared to other state-of-the-art self-supervised representation learning and supervised learning techniques, and equivalent accuracy is achieved with an order of magnitude less human input. This offers a practical solution to the problem of training deep-learning neural networks in application domains where there is limited transfer of learning across datasets.
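As an illustration of how three-dimensional position can act as a hard constraint, the following minimal sketch selects candidate "similar" image pairs purely from survey metadata. It is not the thesis's implementation: the function name `positive_pairs`, the `radius` threshold, and the example positions are all hypothetical, and the abstract does not state how the constraint is actually defined.

```python
import math

def positive_pairs(positions, radius):
    """Return index pairs (i, j), i < j, of images taken within `radius` metres in 3D.

    `positions` holds one (east, north, depth) tuple per image, in metres.
    The fixed distance threshold is an assumption made for illustration only.
    """
    pairs = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            # math.dist computes the Euclidean distance between two points
            if math.dist(positions[i], positions[j]) <= radius:
                pairs.append((i, j))
    return pairs

# Hypothetical image positions: two close together, one far away and much deeper
positions = [(0.0, 0.0, 20.0), (0.5, 0.0, 20.2), (100.0, 50.0, 780.0)]
pairs = positive_pairs(positions, radius=1.0)
```

In a contrastive setup, pairs selected this way could be treated as positives and all other pairs as negatives, so no human labels are needed to form the training signal.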
University of Southampton
Yamada, Takaki
81c66c35-0e2b-4342-80fa-cbee6ff9ce5f
October 2021
Thornton, Blair
8293beb5-c083-47e3-b5f0-d9c3cee14be9
Yamada, Takaki (2021) Leveraging domain knowledge in machine learning for seafloor image interpretation. University of Southampton, Doctoral Thesis, 155pp.
Record type: Thesis (Doctoral)
Text: Takaki_Yamada_PhD_thesis_for_repository - Version of Record
Text: Permission to deposit thesis - form_Takaki_Yamada (Restricted to Repository staff only)
More information
Published date: October 2021
Identifiers
Local EPrints ID: 467876
URI: http://eprints.soton.ac.uk/id/eprint/467876
PURE UUID: 75b0ad68-0bb8-4f6d-97ee-92805431d0fd
Catalogue record
Date deposited: 23 Jul 2022 02:11
Last modified: 17 Mar 2024 07:18
Contributors
Author: Takaki Yamada