Capturing interactive data transformation operations using provenance workflows
 
  The ready availability of data is leading to the increased opportunity of their re-use for new applications and for analyses. Most of these data are not necessarily in the format users want, are usually heterogeneous and highly dynamic, and this necessitates data transformation efforts to re-purpose them. Interactive data transformation (IDT) tools are becoming easily available to lower these barriers to data transformation efforts. This paper describes a principled way to capture data lineage of interactive data transformation processes. We provide a formal model of IDT, its mapping to a provenance representation, and its implementation and validation on Google Refine. Provision of the data transformation process sequences allows assessment of data quality and ensures portability between IDT and other data transformation platforms. The proposed model showed a high level of coverage against a set of requirements used for evaluating systems that provide provenance management solutions.
  
    
      Omitola, Tope
      Freitas, Andre
      Curry, Edward
      O’Riain, Sean
      Gibbins, Nicholas
      Shadbolt, Nigel

      May 2012
  
    Omitola, Tope, Freitas, Andre, Curry, Edward, O’Riain, Sean, Gibbins, Nicholas and Shadbolt, Nigel (2012) Capturing interactive data transformation operations using provenance workflows. The Third International Workshop on the Role of Semantic Web in Provenance Management (SWPM 2012), Heraklion, Greece. 27 - 28 May 2012. 12 pp.
    
      Record type: Conference or Workshop Item (Paper)
        
  
    Text: Omitola_Eswc2012_Provenance_Wkshop.pdf - Author's Original
  
  More information
      Published date: May 2012
      Venue - Dates: The Third International Workshop on the Role of Semantic Web in Provenance Management (SWPM 2012), Heraklion, Greece, 2012-05-27 - 2012-05-28
      Organisations: Electronics & Computer Science
  
  Identifiers
      Local EPrints ID: 336970
      URI: http://eprints.soton.ac.uk/id/eprint/336970
      PURE UUID: b27827e6-671e-4140-99c8-95b118fe1062
  
  Catalogue record
  Date deposited: 12 Apr 2012 14:20
  Last modified: 15 Mar 2024 03:00
    
  Contributors
      Author: Tope Omitola
      Author: Andre Freitas
      Author: Edward Curry
      Author: Sean O’Riain
      Author: Nicholas Gibbins
      Author: Nigel Shadbolt
  
   
  