A comparison of statistical methods for deriving occupancy estimates from machine learning outputs
The combination of autonomous recording units (ARUs) and machine learning enables scalable biodiversity monitoring. These data are often analysed using occupancy models, yet methods for integrating machine learning outputs with these models are rarely compared. Using the Yucatán black howler monkey as a case study, we evaluated four approaches for integrating ARU data and machine learning outputs into occupancy models: (i) standard occupancy models with verified data, and false-positive occupancy models using (ii) presence-absence data, (iii) counts of detections, and (iv) continuous classifier scores. We assessed estimator accuracy and the effects of decision threshold, temporal subsampling, and verification strategies. We found that classifier-guided listening with a standard occupancy model provided an accurate estimate with minimal verification effort. The false-positive models yielded similarly accurate estimates under specific conditions, but were sensitive to subjective choices including decision threshold. The inability to determine stable parameter choices a priori, coupled with the increased computational complexity of several models (i.e. the detection-count and continuous-score models), limits the practical application of false-positive models. In the case of a high-performance classifier and a readily detectable species, classifier-guided listening paired with a standard occupancy model provides a practical and efficient approach for accurately estimating occupancy.
Acoustic monitoring, Autonomous recording units (ARUs), Biodiversity monitoring, False-positive models, Occupancy modelling, Yucatán black howler monkey
Katsis, Lydia K. D.
Rhinehart, Tessa A.
Dorgay, Elizabeth
Sanchez, Emma E.
Snaddon, Jake L.
Doncaster, C. Patrick
Kitzes, Justin
27 April 2025
Katsis, Lydia K. D., Rhinehart, Tessa A., Dorgay, Elizabeth, Sanchez, Emma E., Snaddon, Jake L., Doncaster, C. Patrick and Kitzes, Justin (2025) A comparison of statistical methods for deriving occupancy estimates from machine learning outputs. Scientific Reports, 15 (1), [14700]. (doi:10.1038/s41598-025-95207-3).
Text
s41598-025-95207-3
- Version of Record
More information
Accepted/In Press date: 19 March 2025
Published date: 27 April 2025
Additional Information:
Publisher Copyright:
© The Author(s) 2025.
Identifiers
Local EPrints ID: 502035
URI: http://eprints.soton.ac.uk/id/eprint/502035
ISSN: 2045-2322
PURE UUID: 49ae0a04-e219-4414-b6da-ccd1b211615a
Catalogue record
Date deposited: 13 Jun 2025 17:23
Last modified: 22 Aug 2025 01:39
Contributors
Author: Lydia K. D. Katsis
Author: Tessa A. Rhinehart
Author: Elizabeth Dorgay
Author: Emma E. Sanchez
Author: Jake L. Snaddon
Author: C. Patrick Doncaster
Author: Justin Kitzes