University of Southampton Institutional Repository

Accessing Textual Information Embedded in Internet Images

Antonacopoulos, Apostolos, Karatzas, Dimosthenis and Ortiz Lopez, J (2001) Accessing Textual Information Embedded in Internet Images. At SPIE, Internet Imaging II, United States. pp. 198-205.

Record type: Conference or Workshop Item (Paper)

Abstract

Indexing and searching of WWW pages relies on analysing text. Current technology cannot process text embedded in images on WWW pages. This paper argues that this is a significant problem, as text in image form is usually semantically important (e.g. headers, titles). The results of a recent study are presented to show that the majority (76%) of words embedded in images do not appear elsewhere in the main text, and that the majority (56%) of ALT tag descriptions of images are incorrect or do not exist at all. Research under way to devise tools that extract text from images, based on the way humans perceive colour differences, is outlined and results are presented.

More information

Published date: 2001
Additional Information: Event Dates: January 2001
Venue - Dates: SPIE, Internet Imaging II, United States, 2001-01-01
Keywords: Web document analysis, image analysis, text extraction
Organisations: Electronics & Computer Science

Identifiers

Local EPrints ID: 263506
URI: http://eprints.soton.ac.uk/id/eprint/263506
PURE UUID: 582e7d37-70da-4992-b760-147a5ba5f998

Catalogue record

Date deposited: 19 Feb 2007
Last modified: 18 Jul 2017 07:45

Contributors

Author: Apostolos Antonacopoulos
Author: Dimosthenis Karatzas
Author: J Ortiz Lopez
