The University of Southampton
University of Southampton Institutional Repository

Ground Truth for Layout Analysis Performance Evaluation


Antonacopoulos, Apostolos, Karatzas, Dimosthenis and Bridson, David (2006) Ground Truth for Layout Analysis Performance Evaluation. In Bunke, Horst and Spitz, A.L. (eds.) Document Analysis Systems VII, Springer Lecture Notes in Computer Science, LNCS 3872, pp. 302-311.

Record type: Book Section

Abstract

Over the past two decades a significant number of layout analysis (page segmentation and region classification) approaches have been proposed in the literature. Each approach has been devised for and/or evaluated using (usually small) application-specific datasets. While the need for objective performance evaluation of layout analysis algorithms is evident, there does not exist a suitable dataset with ground truth that reflects the realities of everyday documents (widely varying layouts, complex entities, colour, noise etc.). The most significant impediment is the creation of accurate and flexible (in representation) ground truth, a task that is costly and must be carefully designed. This paper discusses the issues related to the design, representation and creation of ground truth in the context of a realistic dataset developed by the authors. The effectiveness of the ground truth discussed in this paper has been successfully shown in its use for two international page segmentation competitions (ICDAR2003 and ICDAR2005).

Text
DAS2006_Antonacopoulos_nonLNCS_version.pdf - Other
Download (683kB)

More information

Published date: 2006
Keywords: ground truth, layout analysis, colour document analysis, performance evaluation
Organisations: Electronics & Computer Science

Identifiers

Local EPrints ID: 263543
URI: http://eprints.soton.ac.uk/id/eprint/263543
PURE UUID: 3ab92505-665a-4c77-b42a-29bfcf8eea98

Catalogue record

Date deposited: 19 Feb 2007
Last modified: 14 Mar 2024 07:34

Contributors

Author: Apostolos Antonacopoulos
Author: Dimosthenis Karatzas
Author: David Bridson
Editor: Horst Bunke
Editor: A.L. Spitz
