The University of Southampton
University of Southampton Institutional Repository

Text Extraction from Web Images Based on A Split-and-Merge Segmentation Method Using Color Perception


Karatzas, Dimosthenis and Antonacopoulos, Apostolos (2004) Text Extraction from Web Images Based on A Split-and-Merge Segmentation Method Using Color Perception. 17th International Conference on Pattern Recognition (ICPR2004), Cambridge, United Kingdom. 23-26 Aug 2004. pp. 634-637.

Record type: Conference or Workshop Item (Paper)

Abstract

This paper describes a complete approach to the segmentation and extraction of text from Web images for subsequent recognition, to ultimately achieve both effective indexing and presentation by non-visual means (e.g., audio). The method described here (the first in the authors’ systematic approach to exploit human colour perception) enables the extraction of text in complex situations such as in the presence of varying colour (characters and background). More precisely, in addition to using structural features, the segmentation follows a split-and-merge strategy based on the Hue-Lightness-Saturation (HLS) representation of colour as a first approximation of an anthropocentric expression of the differences in chromaticity and lightness. Character-like components are then extracted as forming textlines in a number of orientations and along curves.
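
As a rough illustration of the idea in the abstract, the sketch below merges neighbouring pixels whose colours are close in HLS space. It is not the authors' method: it covers only a simplified merge step (no split phase and no structural features), and the thresholds `HUE_TOL`, `LIGHT_TOL` and `MIN_SAT`, along with the function names, are hypothetical values and names chosen for this example. Only the Python standard library is used.

```python
import colorsys
from collections import deque

# Hypothetical thresholds for illustration only (not taken from the paper).
HUE_TOL = 0.05    # maximum hue difference, as a fraction of the hue circle
LIGHT_TOL = 0.10  # maximum lightness difference, on a 0..1 scale
MIN_SAT = 0.15    # below this saturation, hue is treated as unreliable


def to_hls(rgb):
    """Convert an (r, g, b) tuple with 0-255 channels to (h, l, s) in 0..1."""
    r, g, b = (channel / 255.0 for channel in rgb)
    return colorsys.rgb_to_hls(r, g, b)


def hue_distance(h1, h2):
    """Shortest distance between two hues on the unit hue circle."""
    d = abs(h1 - h2)
    return min(d, 1.0 - d)


def similar(rgb_a, rgb_b):
    """Decide whether two colours are close in lightness and chromaticity."""
    ha, la, sa = to_hls(rgb_a)
    hb, lb, sb = to_hls(rgb_b)
    if abs(la - lb) > LIGHT_TOL:
        return False
    if sa >= MIN_SAT and sb >= MIN_SAT:
        # Both colours are saturated enough for hue to be meaningful.
        return hue_distance(ha, hb) <= HUE_TOL
    # Otherwise they match only if both are near-grey.
    return sa < MIN_SAT and sb < MIN_SAT


def merge_regions(image):
    """Label 4-connected regions by chaining similar neighbouring pixels.

    `image` is a list of rows of (r, g, b) tuples. Returns (labels, count).
    """
    rows, cols = len(image), len(image[0])
    labels = [[-1] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == -1
                            and similar(image[y][x], image[ny][nx])):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
            count += 1
    return labels, count


# Tiny example: two slightly different reds on a slightly varying white.
RED, RED2 = (200, 30, 30), (210, 40, 35)
WHITE, OFFWHITE = (255, 255, 255), (245, 245, 245)
example = [
    [WHITE, OFFWHITE, WHITE],
    [RED,   RED2,     WHITE],
    [WHITE, WHITE,    OFFWHITE],
]
example_labels, example_count = merge_regions(example)
```

Comparing lightness and hue separately, and ignoring hue for near-grey pixels, is one simple way to approximate the chromaticity and lightness differences the abstract mentions. Because pixels are compared with their neighbours rather than with a region average, gradual colour drift can chain across a region; a fuller implementation would need to guard against that.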

More information

Published date: 2004
Additional Information: Event Dates: August 23-26, 2004
Venue - Dates: 17th International Conference on Pattern Recognition (ICPR2004), Cambridge, United Kingdom, 2004-08-23 - 2004-08-26
Keywords: text extraction, web document analysis, colour perception
Organisations: Electronics & Computer Science

Identifiers

Local EPrints ID: 263524
URI: http://eprints.soton.ac.uk/id/eprint/263524
PURE UUID: 0520cf94-f4e1-448a-921b-bebfdce04c4f

Catalogue record

Date deposited: 19 Feb 2007
Last modified: 14 Mar 2024 07:34

Contributors

Author: Dimosthenis Karatzas
Author: Apostolos Antonacopoulos
