The University of Southampton
University of Southampton Institutional Repository

Predictive modeling of groundwater nitrate pollution using Random Forest and multisource variables related to intrinsic and specific vulnerability: a case study in an agricultural setting (Southern Spain)


Rodriguez-Galiano, V.F., Mendes, M.P., Garcia-Soldado, M.J., Chica-Olmo, M. and Ribeiro, L. (2014) Predictive modeling of groundwater nitrate pollution using Random Forest and multisource variables related to intrinsic and specific vulnerability: a case study in an agricultural setting (Southern Spain). Science of the Total Environment, 476-477, 189-206. (doi:10.1016/j.scitotenv.2014.01.001).

Record type: Article

Abstract

Watershed management decisions need robust methods that allow accurate predictive modeling of pollutant occurrences. Random Forest (RF) is a powerful, data-driven machine learning method that is rarely used in water resources studies and thus has not been evaluated thoroughly in this field. When compared to more conventional pattern recognition techniques, key advantages of RF include its non-parametric nature, high predictive accuracy, and capability to determine variable importance. This last characteristic can be used to better understand the individual role and the combined effect of explanatory variables in both protecting groundwater from, and exposing it to, a pollutant.

In this paper, the performance of RF regression for predictive modeling of nitrate pollution is explored, based on intrinsic and specific vulnerability assessment of the Vega de Granada aquifer. The applicability of this new machine learning technique is demonstrated in an agriculture-dominated area where nitrate concentrations in groundwater can exceed the trigger value of 50 mg/L at many locations. A comprehensive GIS database of twenty-four parameters related to intrinsic hydrogeologic properties, driving forces, remotely sensed variables and physical–chemical variables measured in situ was used as input to build different predictive models of nitrate pollution. RF measures of importance were also used to define the most significant predictors of nitrate pollution in groundwater, allowing the establishment of the pollution sources (pressures).

The potential of RF for generating a map of vulnerability to nitrate pollution is assessed considering multiple criteria related to variations in the algorithm parameters and the accuracy of the maps. The performance of RF is also evaluated in comparison to the logistic regression (LR) method, using different efficiency measures to ensure their generalization ability. Prediction results show the ability of RF to build accurate models with strong predictive capabilities.
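
The idea of using RF measures of importance to rank predictors can be illustrated with a minimal sketch. It is not the authors' implementation or data: it grows a small bagged ensemble of regression trees with random feature subsets in pure Python, then estimates permutation importance on a held-out set (a simplification of the out-of-bag estimate usually associated with RF). The predictor names, the synthetic "nitrate" response, and all parameter values are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def build_tree(rows, targets, depth, mtry, rng, min_leaf=5):
    """Grow one regression tree; each split tries a random subset of features."""
    if depth == 0 or len(rows) < 2 * min_leaf:
        return mean(targets)  # leaf: average response
    best = None  # (sum of squared errors, feature index, threshold)
    for f in rng.sample(range(len(rows[0])), mtry):
        values = sorted({r[f] for r in rows})
        for thr in values[:: max(1, len(values) // 10)]:  # coarse threshold grid
            left = [t for r, t in zip(rows, targets) if r[f] <= thr]
            right = [t for r, t in zip(rows, targets) if r[f] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            ml, mr = mean(left), mean(right)
            sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, thr)
    if best is None:
        return mean(targets)
    _, f, thr = best
    li = [i for i, r in enumerate(rows) if r[f] <= thr]
    ri = [i for i, r in enumerate(rows) if r[f] > thr]
    return (
        f,
        thr,
        build_tree([rows[i] for i in li], [targets[i] for i in li], depth - 1, mtry, rng, min_leaf),
        build_tree([rows[i] for i in ri], [targets[i] for i in ri], depth - 1, mtry, rng, min_leaf),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, thr, left, right = node
        node = left if row[f] <= thr else right
    return node


def fit_forest(X, y, n_trees=25, depth=4, mtry=2, seed=0):
    """Fit each tree on a bootstrap sample of the training rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], depth, mtry, rng))
    return forest


def mse(forest, X, y):
    """Mean squared error of the averaged tree predictions."""
    preds = [mean([predict_tree(t, r) for t in forest]) for r in X]
    return mean([(p - t) ** 2 for p, t in zip(preds, y)])


def permutation_importance(forest, X, y, rng):
    """Increase in MSE when one predictor column is shuffled."""
    base = mse(forest, X, y)
    scores = []
    for f in range(len(X[0])):
        col = [r[f] for r in X]
        rng.shuffle(col)
        X_perm = [r[:f] + [v] + r[f + 1:] for r, v in zip(X, col)]
        scores.append(mse(forest, X_perm, y) - base)
    return scores


# Hypothetical predictors and synthetic response (not the paper's data).
rng = random.Random(42)
names = ["n_fertilizer_load", "depth_to_water", "unrelated_noise"]
X = [[rng.random() for _ in names] for _ in range(300)]
y = [60 * r[0] + 30 * (1 - r[1]) + rng.gauss(0, 2) for r in X]

forest = fit_forest(X[:200], y[:200])
importance = permutation_importance(forest, X[200:], y[200:], rng)
for name, score in sorted(zip(names, importance), key=lambda p: -p[1]):
    print(f"{name}: increase in MSE = {score:.1f}")
```

In this sketch, a predictor whose shuffling barely changes the error contributes little to the model, whereas a large increase marks an influential predictor. In practice, an established library implementation would normally be preferred over hand-written trees.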

More information

Published date: 1 April 2014
Organisations: Global Env Change & Earth Observation

Identifiers

Local EPrints ID: 370079
URI: http://eprints.soton.ac.uk/id/eprint/370079
ISSN: 0048-9697
PURE UUID: 2191e30c-a5d8-4f70-8dac-3be7e9b237f0

Catalogue record

Date deposited: 23 Oct 2014 10:52
Last modified: 14 Mar 2024 18:12

Contributors

Author: V.F. Rodriguez-Galiano
Author: M.P. Mendes
Author: M.J. Garcia-Soldado
Author: M. Chica-Olmo
Author: L. Ribeiro
