Semantics-Based Content Extraction in Typewritten Historical Documents
Antonacopoulos, Apostolos and Karatzas, Dimosthenis (2005) Semantics-Based Content Extraction in Typewritten Historical Documents. In, 8th International Conference on Document Analysis and Recognition (ICDAR2005), Seoul, Korea, 29 Aug - 01 Sep 2005. IEEE-CS Press, 48-53.
Description/Abstract
This paper presents a flexible approach to extracting content from scanned historical documents using semantic information. The final electronic document is the result of a "digital historical document lifecycle" process, where the expert knowledge of the historian/archivist user is incorporated at different stages. Results show that such a conversion strategy, aided by (expert) user-specified semantic information and enabling individual parts of the document to be processed in a specialised way, produces results that are superior (in a variety of significant ways) to those of document analysis and understanding techniques devised for contemporary documents.
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Additional Information: | Event Dates: August 29 - September 1, 2005 |
| Keywords: | Historical documents, digital libraries, text enhancement, image analysis |
| Divisions: | Faculty of Physical and Applied Science > Electronics and Computer Science |
| Item ID: | 263542 |
| Date Deposited: | 19 Feb 2007 |
| Last Modified: | 02 Mar 2012 12:20 |
| Contributors: | Antonacopoulos, Apostolos (Author) Karatzas, Dimosthenis (Author) |
| Date: | 2005 |
| Status: | Published |
| Publisher: | IEEE-CS Press |
| ISI Citation Count: | 4 |
| URI: | http://eprints.soton.ac.uk/id/eprint/263542 |