The current and future uses of machine learning in ecosystem service research
Scowen, Matthew
85efda7f-d3d5-4327-9463-2ff1127cb283
Athanasiadis, Ioannis N.
877a8a4a-d1db-4f2c-bd4f-6c9e53f62700
Bullock, James M.
5f66ee49-9e8a-47d6-a799-6c89bf7c9705
Eigenbrod, Felix
43efc6ae-b129-45a2-8a34-e489b5f05827
Willcock, Simon
99d42450-83e7-477a-82a9-b32835258c96
10 August 2021
Scowen, Matthew, Athanasiadis, Ioannis N., Bullock, James M., Eigenbrod, Felix and Willcock, Simon (2021) The current and future uses of machine learning in ecosystem service research. Science of the Total Environment, 799. (doi:10.1016/j.scitotenv.2021.149263).
Abstract
Machine learning (ML) expands traditional data analysis and presents a range of opportunities in ecosystem service (ES) research, offering rapid processing of ‘big data’ and enabling significant advances in data description and predictive modelling. Descriptive ML techniques group data with little or no prior domain-specific assumptions; they can generate hypotheses and automatically sort data prior to other analyses. Predictive ML techniques allow for the predictive modelling of highly non-linear systems where causal mechanisms are poorly understood, as is often the case for ES. We conducted a review to explore how ML is used in ES research and to identify and quantify trends in the different ML approaches that are used. We reviewed 308 peer-reviewed publications and found that ES studies implemented ML techniques in data description (64%; n = 308) and predictive modelling (44%), with some papers falling into both categories. Classification and Regression Trees were the most popular techniques (60%), but unsupervised learning techniques were also used for descriptive tasks such as clustering to group or split data without prior assumptions (19%). Whilst there are examples of ES publications that apply ML with rigour, many studies do not have robust or repeatable methods. Some studies fail to report model settings (43%) or software used (28%), and many studies do not report carrying out any form of model hyperparameter tuning (67%) or testing model generalisability (59%). Whilst some studies use ML to analyse very large and complex datasets, ES research is generally not taking full advantage of the capacity of ML to model big data (median number of data points: 1138; median number of variables: 13). There is great further opportunity to utilise ML in ES research, to make better use of big data and to develop detailed modelling of spatial-temporal dynamics that meets stakeholder demands.
Text
1-s2.0-S0048969721043369-main
- Version of Record
More information
e-pub ahead of print date: 27 July 2021
Published date: 10 August 2021
Identifiers
Local EPrints ID: 490537
URI: http://eprints.soton.ac.uk/id/eprint/490537
ISSN: 0048-9697
PURE UUID: 0faff63e-147f-4a2a-9146-a31e1d646ee6
Catalogue record
Date deposited: 30 May 2024 16:30
Last modified: 31 May 2024 01:43
Contributors
Author: Matthew Scowen
Author: Ioannis N. Athanasiadis
Author: James M. Bullock
Author: Felix Eigenbrod
Author: Simon Willcock