A general procedure to generate models for urban environmental-noise pollution using feature selection and machine learning methods
Pages: 680-693
Torija, Antonio J. and Ruiz, Diego P. (2015) A general procedure to generate models for urban environmental-noise pollution using feature selection and machine learning methods. Science of the Total Environment, 505, 680-693. (doi:10.1016/j.scitotenv.2014.08.060).
Abstract
Predicting environmental noise in urban environments is a complex, non-linear problem because of the intricate relationships among the many variables involved in characterizing and modelling environmental noise and its magnitudes. Moreover, accounting for the great spatial heterogeneity characteristic of urban environments appears essential to predict environmental noise accurately in cities. This paper addresses the problem by proposing a procedure based on feature-selection techniques and machine-learning regression methods and applying it to this environmental problem. Three machine-learning regression methods, considered very robust in solving non-linear problems, are used to estimate the energy-equivalent sound-pressure level descriptor (LAeq): (i) multilayer perceptron (MLP), (ii) sequential minimal optimisation (SMO), and (iii) Gaussian processes for regression (GPR). In addition, the high number of input variables involved in modelling and estimating environmental noise in urban environments makes LAeq prediction models complex and costly, in time and resources, to apply to real situations, so three techniques are used for feature selection or data reduction. The feature-selection techniques are (i) correlation-based feature-subset selection (CFS) and (ii) wrapper for feature-subset selection (WFS); the data-reduction technique is principal-component analysis (PCA). The subsequent analysis leads to a proposal of different schemes, depending on the needs regarding data collection and accuracy. Using WFS as the feature-selection technique with SMO or GPR as the regression algorithm provides the best LAeq estimation (R² = 0.94 and mean absolute error (MAE) = 1.14–1.16 dB(A)).
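The wrapper idea named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: a simple k-nearest-neighbour regressor stands in for SMO or GPR, the data are synthetic (two informative inputs and four irrelevant ones), and all function names are hypothetical. What it shows is the defining step of a wrapper method, namely that candidate feature subsets are scored by the cross-validated error of the regressor itself, together with the two metrics quoted above (R² and MAE).

```python
import random
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination R^2."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(X_train, y_train, x, features, k=5):
    """Average the targets of the k nearest training rows, using only `features`."""
    dists = sorted(
        (sum((row[j] - x[j]) ** 2 for j in features), target)
        for row, target in zip(X_train, y_train)
    )
    return mean(target for _, target in dists[:k])


def cv_predictions(X, y, features, folds=5):
    """Out-of-fold predictions from a simple interleaved k-fold split."""
    preds = [0.0] * len(X)
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        for i in range(f, len(X), folds):
            preds[i] = knn_predict(X_tr, y_tr, X[i], features)
    return preds


def wrapper_forward_selection(X, y, n_features):
    """Greedy forward search: add the feature that most lowers the
    cross-validated MAE of the regressor; stop when no candidate helps."""
    selected, best_err = [], float("inf")
    while len(selected) < n_features:
        candidates = [j for j in range(n_features) if j not in selected]
        err, j = min(
            (mae(y, cv_predictions(X, y, selected + [c])), c) for c in candidates
        )
        if err >= best_err:
            break
        selected.append(j)
        best_err = err
    return selected, best_err


def make_data(n=200, n_features=6, seed=0):
    """Synthetic stand-in data: only features 0 and 2 drive the target."""
    rng = random.Random(seed)
    X = [[rng.uniform(0, 1) for _ in range(n_features)] for _ in range(n)]
    y = [60 + 10 * row[0] + 8 * row[2] ** 2 + rng.gauss(0, 0.5) for row in X]
    return X, y


X, y = make_data()
selected, best_err = wrapper_forward_selection(X, y, n_features=6)
print("selected features:", selected)
print("cross-validated MAE:", round(best_err, 3))
print("cross-validated R^2:", round(r2(y, cv_predictions(X, y, selected)), 3))
```

Because the subset is scored with the same regressor that will make the final prediction, a wrapper search is more expensive than a filter such as CFS, but the selected inputs are tailored to that regressor.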
Text
Torija_Ruiz_STOTEN_2015.pdf
- Version of Record
Restricted to Repository staff only
More information
Accepted/In Press date: 19 August 2014
e-pub ahead of print date: 30 October 2014
Published date: 1 February 2015
Keywords:
feature selection, multiple linear regression, multilayer perceptron, sequential minimal optimisation, gaussian processes for regression, environmental-noise prediction
Organisations:
Acoustics Group
Identifiers
Local EPrints ID: 386681
URI: http://eprints.soton.ac.uk/id/eprint/386681
ISSN: 0048-9697
PURE UUID: 005f3093-a873-48b2-b135-5a06b4308366
Catalogue record
Date deposited: 03 Feb 2016 12:11
Last modified: 14 Mar 2024 22:36
Contributors
Author: Antonio J. Torija
Author: Diego P. Ruiz