Automatic Ontology-Based Knowledge Extraction from Web Documents
 
  To bring the Semantic Web to life and provide advanced knowledge services, we need efficient ways to access and extract knowledge from Web documents. Although Web page annotations could facilitate such knowledge gathering, annotations are rare and will probably never be rich or detailed enough to cover all the knowledge these documents contain. Manual annotation is impractical and unscalable, and automatic annotation tools remain largely undeveloped. Specialized knowledge services therefore require tools that can search and extract specific knowledge directly from unstructured text on the Web, guided by an ontology that details what type of knowledge to harvest. An ontology uses concepts and relations to classify domain knowledge. Other researchers have used ontologies to support knowledge extraction [1, 2], but few have explored their full potential in this domain. The Artequakt project links a knowledge-extraction tool with an ontology to achieve continuous knowledge support and guide information extraction. The extraction tool searches online documents and extracts knowledge that matches the given classification structure. It provides this knowledge in a machine-readable format that will be automatically maintained in a knowledge base (KB). Users could further enhance knowledge extraction using a lexicon-based term expansion mechanism that provides extended ontology terminology.
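
  As a rough illustration of the idea in the abstract, the following minimal sketch shows ontology-guided extraction with lexicon-based term expansion. It is not the Artequakt implementation: the `ONTOLOGY` and `LEXICON` dictionaries, the function names `expand_terms` and `extract`, the sample text, and the simple keyword matching are all assumptions made for this example.

```python
import re

# Hypothetical ontology fragment: relation name -> seed terms.
ONTOLOGY = {
    "place_of_birth": ["born"],
    "occupation": ["painter"],
}

# Hypothetical lexicon: term -> related terms used for expansion.
LEXICON = {
    "painter": ["artist"],
}


def expand_terms(ontology, lexicon):
    """Return relation -> set of seed terms plus their lexicon entries."""
    expanded = {}
    for relation, terms in ontology.items():
        all_terms = set(terms)
        for term in terms:
            all_terms.update(lexicon.get(term, []))
        expanded[relation] = all_terms
    return expanded


def extract(text, expanded):
    """Return sorted (relation, sentence) pairs for sentences containing a term."""
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    found = set()
    for sentence in sentences:
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        for relation, terms in expanded.items():
            if words & terms:
                found.add((relation, sentence))
    return sorted(found)


# Illustrative input text (an assumption, not taken from the paper).
text = "Rembrandt was born in Leiden. He was a Dutch artist."
facts = extract(text, expand_terms(ONTOLOGY, LEXICON))
```

  In this sketch the second sentence is matched only because the lexicon expands "painter" to include "artist"; without expansion, only the birth sentence would be extracted. A real system would need linguistic analysis rather than keyword matching.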
  
  Pages: 14-21
  
    
      Alani, Harith
      
        70cdbdce-1494-44c2-9dae-65d82bf7e991
      
     
  
    
      Kim, Sanghee
      
        9e0e5909-9fbe-4c37-9606-2fdea35eac12
      
     
  
    
      Millard, David E.
      
        4f19bca5-80dc-4533-a101-89a5a0e3b372
      
     
  
    
      Weal, Mark J.
      
        e8fd30a6-c060-41c5-b388-ca52c81032a4
      
     
  
    
      Hall, Wendy
      
        11f7f8db-854c-4481-b1ae-721a51d8790c
      
     
  
    
      Lewis, Paul H.
      
        7aa6c6d9-bc69-4e19-b2ac-a6e20558c020
      
     
  
    
      Shadbolt, Nigel R.
      
        5c5acdf4-ad42-49b6-81fe-e9db58c2caf7
      
     
  
  
   
  
  
    
      January 2003
    
    
  
  
    
  
       
    
 
  
    
      
  
  
  
  
  
  
    Alani, Harith, Kim, Sanghee, Millard, David E., Weal, Mark J., Hall, Wendy, Lewis, Paul H. and Shadbolt, Nigel R. (2003) Automatic Ontology-Based Knowledge Extraction from Web Documents. IEEE Intelligent Systems, 18 (1), 14-21.
  
   
  
  
   
  
  
  
  
  
   
  
    
    
      
        
         
      
      
        
          
            
  
    Text: Alani-IEEE-IS-2002.pdf - Other
   
  
  
 
          
            
          
            
           
            
           
        
        
       
    
   
  
  
  More information
  
    
      Published date: January 2003
 
    
  
  
    
  
    
  
    
  
    
  
    
  
    
  
    
     
        Organisations:
        Web & Internet Science
      
    
  
    
  
  
  
    
  
  
        Identifiers
        Local EPrints ID: 257396
        URI: http://eprints.soton.ac.uk/id/eprint/257396
        
        
        
          ISSN: 1541-1672
        
        
          PURE UUID: df4550ad-5d12-4ae0-ac5c-dbb7d77918ef
        
  
  Catalogue record
  Date deposited: 14 Apr 2003
  Last modified: 15 Mar 2024 02:58
  
  
 
 
  
    
    
      Contributors
      
          
          Author: Harith Alani
          Author: Sanghee Kim
          Author: David E. Millard
          Author: Mark J. Weal
          Author: Wendy Hall
          Author: Paul H. Lewis
          Author: Nigel R. Shadbolt
              
              
            
            
          
        
      
      
      
    
  
   
  