How seafloor image annotation reliability affects machine learning for ecological assessment

Manual image annotation is laborious and prone to human bias, method bias and random error, all of which can create uncertainty in ecosystem monitoring. Using multiple annotators and machine learning can reduce individual annotation effort and, when annotation error is accounted for, some of that error. The first study of this thesis assessed the variability and bias in two common manual annotation methods, grid-based estimation and manual segmentation, when they were used by 11 different annotators to estimate living cold-water coral density and cover. The annotation methods gave different cover estimates despite being applied to the same images. Grid-based estimation overestimated coral cover by a relative 45%, and the standard deviation of its cover estimates was three times that of estimates made with manual segmentation. Manual segmentation underestimated coral cover by a relative 38% because annotators detected small coral colonies but did not draw around them. This underestimation was reduced to 15% by accounting for the variable size bias present in annotator segments using two different modelling techniques. In the subsequent studies, the manual segmentations were used to train machine learning instance segmentation models to automatically segment living coral in images from the same survey. Evaluation of trained model performance and of the resulting coral metric estimates showed that detection success was significantly improved by generating artificial masks for colonies that were not drawn around in the training data, shifting predicted live coral density from a relative underestimate of 15% to a relative overestimate of 2.7%. In the last study of this thesis, multiple annotators' segments were combined to create new training data. Annotator segmentation agreement improved significantly when the drawn segments of two and three annotators were combined, increasing from 40% to 53% and 67%, respectively. Agreement between drawn masks also improved, with the proportion of overlap rising from 68% to 74% and 76%, respectively. The overall findings of this thesis can be used to create more robust estimates of ecological metrics and more robust training datasets for supervised machine learning techniques, improving seafloor image annotation at the scales needed for effective research and environmental monitoring.

University of Southampton

Curtis, Emma Juliet

e07ed097-26f2-4a6d-94d3-84be8d4c66cf

28 March 2025

Thornton, Blair

8293beb5-c083-47e3-b5f0-d9c3cee14be9

Durden, Jennifer M.

a65f5d1f-2009-476a-a8c6-3c32683d9eb9

Bett, Brian J.

937da613-7a28-4403-9d76-713bb8ad0046

Albrecht, James

5cbd4039-77b0-4583-aca0-088747d6d24e

Curtis, Emma Juliet (2025) How seafloor image annotation reliability affects machine learning for ecological assessment. University of Southampton, Doctoral Thesis, 221pp.

Record type: Thesis (Doctoral)