The Case for Explicit Knowledge in Documents
The Web is full of documents which must be interpreted by human readers and by software agents (search engines, recommender systems, clustering processes etc). Although Web standards have addressed format obfuscation by using XML schemas and stylesheets to specify unambiguous structure and presentation semantics, interpretation is still hampered by the fundamental ambiguity of information in #PCDATA text. Even the most easily distinguishable kinds of knowledge such as article citations and proper nouns (referring to people, organisations, projects, products, technical concepts) have to be identified by fallible, post-hoc extraction processes. The WiCK project has investigated the writing process in a Semantic Web environment where knowledge services exist and actively assist the author. In this paper we discuss the need to make knowledge an explicit part of the document representation and the advantages and disadvantages of this step.
90-98
Carr, Leslie
Miles-Board, Timothy
Woukeu, Arouna
Wills, Gary
Hall, Wendy
May 2004
Carr, Leslie, Miles-Board, Timothy, Woukeu, Arouna, Wills, Gary and Hall, Wendy (2004) The Case for Explicit Knowledge in Documents. ACM Symposium on Document Engineering, Milwaukee, Wisconsin. 28 - 30 Oct 2004.
Record type: Conference or Workshop Item (Paper)
Text
pl-04-carr.pdf
- Other
More information
Published date: May 2004
Additional Information:
Event Dates: October 28-30
Venue - Dates:
ACM Symposium on Document Engineering, Milwaukee, Wisconsin, 2004-10-28 - 2004-10-30
Organisations:
Web & Internet Science, Electronic & Software Systems
Identifiers
Local EPrints ID: 259360
URI: http://eprints.soton.ac.uk/id/eprint/259360
PURE UUID: 1fe00727-954d-483c-9637-652c4056f139
Catalogue record
Date deposited: 19 May 2004
Last modified: 15 Mar 2024 02:51
Contributors
Author:
Timothy Miles-Board
Author:
Arouna Woukeu
Author:
Gary Wills