Colour Text Segmentation in Web Images Based on Human Perception
There is a significant need to extract and analyse the text in images in Web documents, for effective indexing, semantic analysis and even presentation by non-visual means (e.g., audio). This paper argues that the challenging segmentation stage for such images benefits from an approach based on human colour perception rather than on analysis in the RGB colour space. The proposed approach enables the segmentation of text in complex situations, such as when colour and texture vary in both characters and background. More precisely, characters are segmented as distinct regions with separate chromaticity and/or lightness by performing a layer decomposition of the image. The method described here results from the authors’ systematic effort to approximate the characteristics of human colour perception when identifying character regions. In this instance, the image is decomposed by histogram analysis of Hue and Lightness in the HLS colour space, followed by merging guided by information on human discrimination of wavelength and luminance.
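As a rough illustration of the histogram step mentioned in the abstract, the sketch below converts RGB pixels to HLS and builds separate Hue and Lightness histograms. It is not the authors' implementation: the function name `hue_lightness_histograms`, the bin counts and the `min_saturation` cut-off are assumptions made for this example, and the perceptually guided merging stage is not shown.

```python
import colorsys


def hue_lightness_histograms(pixels, hue_bins=36, lightness_bins=16,
                             min_saturation=0.05):
    """Build Hue and Lightness histograms for (r, g, b) pixels in 0..255.

    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    hue_hist = [0] * hue_bins
    light_hist = [0] * lightness_bins
    for r, g, b in pixels:
        # colorsys works on floats in [0, 1] and returns (hue, lightness, saturation).
        h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
        # Every pixel contributes to the Lightness histogram.
        light_hist[min(int(l * lightness_bins), lightness_bins - 1)] += 1
        # Hue is undefined for (near-)grey pixels, so only sufficiently
        # saturated pixels contribute to the Hue histogram.
        if s >= min_saturation:
            hue_hist[min(int(h * hue_bins), hue_bins - 1)] += 1
    return hue_hist, light_hist


# Two red pixels, one blue pixel and one mid-grey pixel.
sample = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (128, 128, 128)]
hue_hist, light_hist = hue_lightness_histograms(sample)
```

Peaks in such histograms could then serve as candidate layers. Note that Hue is circular, so the first and last Hue bins are neighbours when looking for peaks.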
Character segmentation, text extraction, web document analysis, web images, image analysis, colour perception
Karatzas, Dimosthenis
Antonacopoulos, Apostolos
2007
Karatzas, Dimosthenis and Antonacopoulos, Apostolos
(2007)
Colour Text Segmentation in Web Images Based on Human Perception.
Image and Vision Computing.
More information
Published date: 2007
Keywords:
Character segmentation, text extraction, web document analysis, web images, image analysis, colour perception
Organisations:
Electronics & Computer Science
Identifiers
Local EPrints ID: 263554
URI: http://eprints.soton.ac.uk/id/eprint/263554
ISSN: 0262-8856
PURE UUID: cc443955-8ac5-46cb-8356-04c9f1ea9d65
Catalogue record
Date deposited: 19 Feb 2007
Last modified: 14 Mar 2024 07:34
Contributors
Author:
Dimosthenis Karatzas
Author:
Apostolos Antonacopoulos