EDA and a tailored data imputation algorithm for daily ozone concentrations
Air pollution is a critical environmental problem with detrimental effects on human health. It affects all regions of the world, especially low-income cities, where critical levels have been reached. Air pollution plays a direct role in public health, climate change, and the worldwide economy. Effective actions to mitigate air pollution, e.g. research and decision making, require high-resolution observations. This has motivated the emergence of new low-cost sensor technologies, which have the potential to provide high-resolution data thanks to their accessible prices. However, since low-cost sensors are built with relatively low-cost materials, they tend to be unreliable: their measurements are prone to errors, gaps, bias and noise. All of these problems need to be solved before the data can be used to support research or decision making. In this paper, we address the problem of data imputation on a daily air pollution data set with relatively small gaps. Our main contributions are: (1) an air pollution data set composed of several air pollution concentrations, including criteria gases, and thirteen meteorological covariates; and (2) a custom algorithm for data imputation of daily ozone concentrations based on a trend surface and a Gaussian process. Data visualization techniques were used extensively throughout this work, as they are useful tools for understanding the multi-dimensionality of point-referenced sensor data.
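The abstract gives no implementation details for the imputation algorithm, so the following is only a minimal sketch of the general idea of combining a trend with a Gaussian process, not the authors' method. It assumes a single daily series indexed by time, uses a least-squares line as a stand-in for the trend surface, and fills gaps with the Gaussian process posterior mean of the residuals. The function names, the RBF kernel, and the hyperparameter values (`length_scale`, `variance`, `noise`) are all hypothetical choices made for illustration.

```python
import math


def fit_linear_trend(ts, ys):
    """Least-squares line y = a + b*t through the observed points."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    b = sxy / sxx
    return mean_y - b * mean_t, b


def rbf(t1, t2, length_scale, variance):
    """Squared-exponential (RBF) covariance between two time points."""
    return variance * math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def impute(series, length_scale=3.0, variance=1.0, noise=0.1):
    """Fill None entries with trend + GP posterior mean of the residuals.

    Hyperparameters are illustrative assumptions, not values from the paper.
    `noise` is the observation-noise variance added to the kernel diagonal.
    """
    obs_t = [float(t) for t, y in enumerate(series) if y is not None]
    obs_y = [y for y in series if y is not None]
    a, b = fit_linear_trend(obs_t, obs_y)
    resid = [y - (a + b * t) for t, y in zip(obs_t, obs_y)]
    # Covariance matrix of the observed residuals, plus noise on the diagonal.
    K = [[rbf(ti, tj, length_scale, variance) + (noise if i == j else 0.0)
          for j, tj in enumerate(obs_t)] for i, ti in enumerate(obs_t)]
    alpha = solve(K, resid)
    filled = list(series)
    for t, y in enumerate(series):
        if y is None:
            k_star = [rbf(float(t), ti, length_scale, variance) for ti in obs_t]
            gp_mean = sum(k * w for k, w in zip(k_star, alpha))
            filled[t] = a + b * t + gp_mean
    return filled


# Made-up daily values with short gaps (None), for illustration only.
ozone = [30.0, 32.0, None, 35.0, 36.0, None, None, 40.0, 41.0, 43.0]
filled = impute(ozone)
```

Observed values are returned unchanged and only the gaps are filled. In a point-referenced setting the trend would more plausibly be a polynomial in the station coordinates and the kernel would act on space as well as time, with hyperparameters estimated from the data rather than fixed.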
Keywords: Air pollution, Data imputation, Gaussian process, Sensor data
Pages: 372-386
Gualán, Ronald
5d6e9dc1-0512-4f28-8d6c-8c07d681455b
Saquicela, Víctor
c8d485f4-a61e-405b-a31e-4259bf5bed0b
Tran-Thanh, Long
e0666669-d34b-460e-950d-e8b139fab16c
1 January 2019
Gualán, Ronald, Saquicela, Víctor and Tran-Thanh, Long (2019) EDA and a tailored data imputation algorithm for daily ozone concentrations. Botto-Tobar, M., Barba-Maggi, L., Gonzalez-Huerta, J., Villacres-Cevallos, P., Gomez, O.S. and Uvidia-Fassler, M. (eds.) In Information and Communication Technologies of Ecuador (TIC.EC): TICEC 2018. vol. 884, Springer. pp. 372-386. (doi:10.1007/978-3-030-02828-2_27).
Record type: Conference or Workshop Item (Paper)
This record has no associated files available for download.
More information
e-pub ahead of print date: 18 October 2018
Published date: 1 January 2019
Identifiers
Local EPrints ID: 428997
URI: http://eprints.soton.ac.uk/id/eprint/428997
PURE UUID: cb9dcde5-754c-41db-a1c1-58feb7b806ce
Catalogue record
Date deposited: 15 Mar 2019 17:30
Last modified: 15 Mar 2024 22:41
Contributors
Author: Ronald Gualán
Author: Víctor Saquicela
Author: Long Tran-Thanh
Editor: M. Botto-Tobar
Editor: L. Barba-Maggi
Editor: J. Gonzalez-Huerta
Editor: P. Villacres-Cevallos
Editor: O.S. Gomez
Editor: M. Uvidia-Fassler