University of Southampton Institutional Repository

Implementation Challenges for Nastaliq Character Recognition


Sattar, Sohail A., Haque, Shamsul, Pathan, Mahmood K. and Gee, Quintin (2008) Implementation Challenges for Nastaliq Character Recognition. International Multi Topic Conference (IMTIC'08), Jamshoro, Sindh, Pakistan. 11 - 12 Apr 2008. (Submitted)

Record type: Conference or Workshop Item (Paper)

Abstract

Character recognition in cursive scripts and in handwritten Latin script has recently attracted researchers’ attention, and some work has been done in this area. Optical character recognition (OCR) is the translation of optically scanned bitmaps of printed or written text into digitally editable data files. OCR systems developed for many world languages are already in use, but none exists for Urdu Nastaliq, a calligraphic adaptation of the Arabic script, just as Jawi is for Malay. Urdu Nastaliq has 39 characters against 28 in Arabic. Each character has two to four different shapes according to its position in the word: initial, medial, final and isolated. In Nastaliq, inter-word and intra-word overlapping makes optical recognition more complex; character recognition of the Latin script is easier by comparison. This paper reports research on Urdu Nastaliq OCR, discusses the challenges and suggests a new solution for its implementation.

Text
ASattar_85.doc - Version of Record
Download (339kB)

More information

Submitted date: July 2008
Additional Information: Event Dates: 11-12 April 2008
Venue - Dates: International Multi Topic Conference (IMTIC'08), Jamshoro, Sindh, Pakistan, 2008-04-11 - 2008-04-12
Organisations: Electronics & Computer Science

Identifiers

Local EPrints ID: 266510
URI: http://eprints.soton.ac.uk/id/eprint/266510
PURE UUID: d8527fe2-7cc2-4dfd-8f2a-9c57f90906f3

Catalogue record

Date deposited: 05 Aug 2008 08:56
Last modified: 14 Mar 2024 08:29

Contributors

Author: Sohail A. Sattar
Author: Shamsul Haque
Author: Mahmood K. Pathan
Author: Quintin Gee
