A general procedure to generate models for urban environmental-noise pollution using feature selection and machine learning methods

Predicting environmental noise in urban environments is a complex, non-linear problem because of the intricate relationships among the many variables involved in characterising and modelling environmental noise and its magnitudes. Moreover, accounting for the great spatial heterogeneity characteristic of urban environments seems essential for accurate environmental-noise prediction in cities. This paper addresses the problem by proposing and applying a procedure based on feature-selection techniques and machine-learning regression methods. Three machine-learning regression methods, which are considered very robust in solving non-linear problems, are used to estimate the energy-equivalent sound-pressure level descriptor (LAeq): (i) multilayer perceptron (MLP), (ii) sequential minimal optimisation (SMO), and (iii) Gaussian processes for regression (GPR). In addition, because the high number of input variables involved in environmental-noise modelling and estimation in urban environments makes LAeq prediction models complex and costly in time and resources when applied to real situations, three techniques are used for feature selection or data reduction: two feature-selection techniques, (i) correlation-based feature-subset selection (CFS) and (ii) wrapper for feature-subset selection (WFS), and one data-reduction technique, (iii) principal-component analysis (PCA). The subsequent analysis leads to a proposal of different schemes, depending on the needs regarding data collection and accuracy. Using WFS as the feature-selection technique with SMO or GPR as the regression algorithm provides the best LAeq estimation (R² = 0.94 and mean absolute error (MAE) = 1.14–1.16 dB(A)).
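To make the wrapper idea concrete, the following is a minimal sketch of wrapper-style feature-subset selection scored by cross-validated MAE, together with the two metrics quoted in the abstract (MAE and R²). It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the search strategy, so a greedy forward search is assumed; multiple linear regression stands in for SMO, GPR and MLP to keep the example dependency-free; the data are synthetic; and every function name (`mae`, `r2`, `cv_mae`, `forward_select`, and the helpers) is hypothetical.

```python
import random


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the target (dB(A) for LAeq)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination R^2."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear(rows, y):
    """Least-squares fit with intercept (normal equations); stand-in regressor."""
    xs = [[1.0] + r for r in rows]
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(xs, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, rows):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]


def take(X, idx, cols):
    """Rows idx of X restricted to the feature columns in cols."""
    return [[X[i][c] for c in cols] for i in idx]


def cv_mae(X, y, cols, folds=5):
    """k-fold cross-validated MAE using only the feature columns in cols."""
    errors = []
    for f in range(folds):
        test = [i for i in range(len(y)) if i % folds == f]
        train = [i for i in range(len(y)) if i % folds != f]
        coef = fit_linear(take(X, train, cols), [y[i] for i in train])
        errors.append(mae([y[i] for i in test], predict(coef, take(X, test, cols))))
    return sum(errors) / folds


def forward_select(X, y, min_gain=0.05):
    """Greedy forward wrapper: add the feature that most lowers CV MAE."""
    selected, best = [], float("inf")
    remaining = list(range(len(X[0])))
    while remaining:
        score, col = min((cv_mae(X, y, selected + [c]), c) for c in remaining)
        if best - score < min_gain:  # stop when the gain is negligible
            break
        selected.append(col)
        remaining.remove(col)
        best = score
    return selected, best


# Synthetic data (assumption): 5 candidate inputs, only columns 0 and 2 matter.
rng = random.Random(0)
X = [[rng.random() for _ in range(5)] for _ in range(100)]
y = [60.0 + 8.0 * r[0] + 4.0 * r[2] + rng.gauss(0.0, 0.1) for r in X]

selected, score = forward_select(X, y)
print("selected columns:", selected, "cross-validated MAE:", round(score, 3))
```

The wrapper evaluates each candidate subset with the regressor itself, which is what distinguishes it from a filter such as CFS; the price is one model fit per candidate per fold, so the cost grows quickly with the number of input variables.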

Keywords: feature selection, multiple linear regression, multilayer perceptron, sequential minimal optimisation, Gaussian processes for regression, environmental-noise prediction

DOI: 10.1016/j.scitotenv.2014.08.060

ISSN: 0048-9697

Pages: 680-693

Torija, Antonio J.

Ruiz, Diego P.

1 February 2015

Torija, Antonio J. and Ruiz, Diego P. (2015) A general procedure to generate models for urban environmental-noise pollution using feature selection and machine learning methods. Science of the Total Environment, 505, 680-693. (doi:10.1016/j.scitotenv.2014.08.060).

Record type: Article

Torija_Ruiz_STOTEN_2015.pdf - Version of Record

Restricted to Repository staff only

More information

Accepted/In Press date: 19 August 2014

e-pub ahead of print date: 30 October 2014

Published date: 1 February 2015

Organisations: Acoustics Group