University of Southampton Institutional Repository

Text Extraction from Web Images Based on Human Perception and Fuzzy Inference


Antonacopoulos, Apostolos and Karatzas, Dimosthenis (2001) Text Extraction from Web Images Based on Human Perception and Fuzzy Inference. First International Workshop on Web Document Analysis (WDA2001), Seattle, United States. pp. 35-38.

Record type: Conference or Workshop Item (Paper)

Abstract

There is a significant need to extract and recognise the semantically important text contained in images on Web pages. This paper proposes a new approach to text extraction from this special class of images. The method attempts to emulate, more closely than before, the way humans perceive colour differences in order to differentiate between text and background regions. Pixels of similar colour (as humans see it) are merged into components, and a fuzzy inference mechanism (using connectivity and colour distance features) is devised to group components into larger character-like regions.
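
As an illustration of the first stage described in the abstract (merging pixels of perceptually similar colour into components), the following is a minimal sketch and not the authors' method. The abstract does not state the colour space, the distance measure, or any thresholds, so everything here is an assumption: CIELAB with the CIE76 Euclidean distance stands in for "as humans see it", the threshold of 10 is arbitrary, and the function names `srgb_to_lab`, `delta_e`, and `label_components` are hypothetical. The fuzzy inference stage is not covered.

```python
from collections import deque
from math import sqrt


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB (D65 white point)."""
    def linearise(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearise(c) for c in rgb)
    # Linear RGB to XYZ, normalised by the D65 reference white.
    x = (0.4124 * r + 0.3576 * g + 0.1805 * b) / 0.95047
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = (0.0193 * r + 0.1192 * g + 0.9505 * b) / 1.08883

    def f(t):
        return t ** (1 / 3) if t > 0.008856 else 7.787 * t + 16 / 116

    fx, fy, fz = f(x), f(y), f(z)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e(lab1, lab2):
    """CIE76 colour difference: Euclidean distance in CIELAB."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(lab1, lab2)))


def label_components(image, threshold=10.0):
    """Group 4-connected pixels whose colour is close to a seed pixel.

    `image` is a list of rows of (r, g, b) tuples. Returns a grid of
    component labels and the number of components. The threshold is an
    assumed value, not one taken from the paper.
    """
    h, w = len(image), len(image[0])
    lab = [[srgb_to_lab(p) for p in row] for row in image]
    labels = [[-1] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx] != -1:
                continue
            # Start a new component and flood-fill from this seed pixel.
            seed = lab[sy][sx]
            labels[sy][sx] = count
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] == -1
                            and delta_e(seed, lab[ny][nx]) <= threshold):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
            count += 1
    return labels, count
```

Comparing each candidate pixel against the seed colour, rather than against its immediate neighbour, keeps a component from drifting across a smooth gradient; that is a design choice of this sketch only.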


More information

Published date: 2001
Additional Information: Event Dates: September 2001
Venue - Dates: First International Workshop on Web Document Analysis (WDA2001), Seattle, United States, 2001-09-01
Organisations: Electronics & Computer Science

Identifiers

Local EPrints ID: 263510
URI: http://eprints.soton.ac.uk/id/eprint/263510
PURE UUID: 631c80f4-ab8d-48cf-83fc-abacceed7a77

Catalogue record

Date deposited: 19 Feb 2007
Last modified: 14 Mar 2024 07:33


Contributors

Author: Apostolos Antonacopoulos
Author: Dimosthenis Karatzas
