Leveraging domain knowledge in machine learning for seafloor image interpretation

This thesis develops a method to incorporate domain knowledge into modern machine learning techniques when interpreting large volumes of robotically obtained seafloor imagery. Deep learning has the potential to automate tasks such as habitat and animal recognition in marine monitoring. However, the large human effort needed to train the models is a bottleneck, which motivates research into methods that reduce the human input requirements. This research investigates how metadata gathered during robotic imaging surveys, such as location and depth information, can be used to constrain learning based on expected metadata patterns. Two self-supervised representation learning methods are developed. The first uses deep-learning convolutional autoencoders that leverage location and depth information to impose soft constraints, based on the assumption that images taken in physically nearby locations or at similar depths are more likely to share important features than images taken far apart or at different depths. The second method uses contrastive learning techniques where three-dimensional position information acts as a hard constraint on representation learning. Self-supervision allows both methods to be implemented on a per-dataset basis with no human input. The representations learned can be used for different downstream interpretation tasks: applications to unsupervised clustering and representative image identification (i.e. tasks that do not require any human input) are demonstrated alongside content-based retrieval and semi-supervised learning-based classification (i.e. tasks that require a relatively small amount of human input). Three real-world seafloor image datasets are analysed. These consist of ~150k seafloor images taken over 16 dives by two different Autonomous Underwater Vehicles (AUVs) along sparse and dense survey trajectories spanning a seafloor depth range of 20 to 780 metres.
The results show relative accuracy gains of 7 to 15% compared to other state-of-the-art self-supervised representation learning and supervised learning techniques, and equivalent accuracy for an order of magnitude less human input. This offers a practical solution to the problem of training deep-learning neural networks in application domains where there is limited transfer of learning across datasets.
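
The idea of using three-dimensional position as a constraint in contrastive learning can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes, purely for illustration, that two images are treated as a similar ("positive") pair when they were captured within a fixed 3D distance of each other. The function name `positive_pairs`, the `max_distance_m` threshold, and the example coordinates are all hypothetical.

```python
import math


def positive_pairs(positions, max_distance_m):
    """Return index pairs (i, j), i < j, of images captured close together.

    positions: list of (easting_m, northing_m, depth_m) tuples, one per image.
    max_distance_m: hypothetical threshold; pairs at or below this 3D
        Euclidean distance are treated as similar for contrastive training.
    """
    pairs = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) <= max_distance_m:
                pairs.append((i, j))
    return pairs


# Hypothetical capture positions: two images at ~50 m depth, two at ~300 m.
positions = [
    (0.0, 0.0, 50.0),
    (0.5, 0.2, 50.1),
    (120.0, 80.0, 300.0),
    (120.4, 80.1, 300.2),
]

print(positive_pairs(positions, max_distance_m=1.0))
```

With these assumed values, only images within each depth group are paired; images from different groups would be candidates for dissimilar ("negative") pairs. The pairwise loop is quadratic in the number of images, so a dataset of the size described here would need a spatial index instead.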

University of Southampton

Yamada, Takaki

81c66c35-0e2b-4342-80fa-cbee6ff9ce5f

October 2021

Thornton, Blair

8293beb5-c083-47e3-b5f0-d9c3cee14be9

Yamada, Takaki (2021) Leveraging domain knowledge in machine learning for seafloor image interpretation. University of Southampton, Doctoral Thesis, 155pp.

Record type: Thesis (Doctoral)