EDA and a tailored data imputation algorithm for daily ozone concentrations
  Air pollution is a critical environmental problem with detrimental effects on human health that affects all regions of the world, especially low-income cities, where critical levels have been reached. Air pollution has a direct role in public health, climate change, and the worldwide economy. Effective actions to mitigate air pollution, e.g. research and decision making, require the availability of high-resolution observations. This has motivated the emergence of new low-cost sensor technologies, which have the potential to provide high-resolution data thanks to their accessible prices. However, since low-cost sensors are built with relatively low-cost materials, they tend to be unreliable. That is, measurements from low-cost sensors are prone to errors, gaps, bias and noise. All these problems need to be solved before the data can be used to support research or decision making. In this paper, we address the problem of data imputation on a daily air pollution data set with relatively small gaps. Our main contributions are: (1) an air pollution data set composed of several air pollution concentrations, including criteria gases, and thirteen meteorological covariates; and (2) a custom algorithm for data imputation of daily ozone concentrations based on a trend surface and a Gaussian Process. Data visualization techniques were used extensively throughout this work, as they are useful tools for understanding the multi-dimensionality of point-referenced sensor data.
  Air pollution, Data imputation, Gaussian process, Sensor data
  
  372-386
  
  
    
      Gualán, Ronald
      
        5d6e9dc1-0512-4f28-8d6c-8c07d681455b
      
     
  
    
    
      Saquicela, Víctor
      
        c8d485f4-a61e-405b-a31e-4259bf5bed0b
      
     
  
    
      Tran-Thanh, Long
      
        e0666669-d34b-460e-950d-e8b139fab16c

      1 January 2019

    Gualán, Ronald, Saquicela, Víctor and Tran-Thanh, Long (2019) EDA and a tailored data imputation algorithm for daily ozone concentrations. Botto-Tobar, M., Barba-Maggi, L., Gonzalez-Huerta, J., Villacres-Cevallos, P., Gomez, O.S. and Uvidia-Fassler, M. (eds.) In Information and Communication Technologies of Ecuador (TIC.EC): TICEC 2018. vol. 884, Springer. (doi:10.1007/978-3-030-02828-2_27).
  
   
  
    
      Record type: Conference or Workshop Item (Paper)
      
      
    
   
    
      
        
        
        This record has no associated files available for download.
       
    
    
   
  
  
  More information
  
    
      e-pub ahead of print date: 18 October 2018
 
    
      Published date: 1 January 2019
 
    
  
  
    
  
    
  
    
  
    
  
    
  
    
     
        Keywords:
        Air pollution, Data imputation, Gaussian process, Sensor data
      
    
  
    
  
    
  
  
        Identifiers
        Local EPrints ID: 428997
        URI: http://eprints.soton.ac.uk/id/eprint/428997
        
          
        
        
        
        
          PURE UUID: cb9dcde5-754c-41db-a1c1-58feb7b806ce

  Catalogue record
  Date deposited: 15 Mar 2019 17:30
  Last modified: 15 Mar 2024 22:41
   
   
  
 
 
  
    
    
      Contributors
          Author: Ronald Gualán
      
          
          
          Author: Víctor Saquicela
          Author: Long Tran-Thanh
          Editor: M. Botto-Tobar
          Editor: L. Barba-Maggi
          Editor: J. Gonzalez-Huerta
          Editor: P. Villacres-Cevallos
          Editor: O.S. Gomez
          Editor: M. Uvidia-Fassler
      
      
      
    
  
   
  