Arabic text to Arabic sign language example-based translation system
Almohimeed, Abdulaziz
(2012)
Arabic text to Arabic sign language example-based translation system.
University of Southampton, Faculty of Physical & Applied Sciences, Doctoral Thesis, 180pp.
Record type: Thesis (Doctoral)
Abstract
This dissertation presents the first corpus-based system for translation from Arabic text into Arabic Sign Language (ArSL) for the deaf and hearing impaired, for whom it can facilitate access to conventional media and allow communication with hearing people. In addition to the familiar technical problems of text-to-text machine translation, building a system for sign language translation requires overcoming some additional challenges. First, the lack of a standard writing system requires building a parallel text-to-sign-language corpus from scratch, as well as computational tools to prepare this parallel corpus. Further, the corpus must facilitate output in visual form, which is clearly far more difficult than producing textual output. The time and effort involved in building such a parallel corpus of text and visual signs from scratch mean that we will inevitably be working with quite small corpora. We have constructed two parallel Arabic text-to-ArSL corpora for our system. The first was built from school-level language instruction material and contains 203 signed sentences and 710 signs. The second was constructed from a children's story and contains 813 signed sentences and 2,478 signs. Working with corpora of limited size means that coverage is a major issue. A new technique was derived to exploit Arabic morphological information to increase coverage and, hence, translation accuracy. Further, we employ two different example-based translation methods and combine them to produce more accurate translation output. We have chosen to use concatenated sign video clips as output rather than a signing avatar, both for simplicity and because this allows us to distinguish more easily between translation errors and sign synthesis errors. Using leave-one-out cross-validation on our first corpus, the system produced translated sign sentence outputs with an average word error rate of 36.2% and an average position-independent error rate of 26.9%. The corresponding figures for our second corpus were an average word error rate of 44.0% and an average position-independent error rate of 28.1%. The most frequent source of errors is missing signs in the corpus; this could be addressed in future by collecting more corpus material. Finally, it is not possible to compare the performance of our system with any competing Arabic text-to-ArSL machine translation system, since no such systems exist at present.
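The two evaluation metrics reported above can be sketched as follows. This is a minimal illustration, not the thesis's own evaluation code: word error rate (WER) is word-level Levenshtein distance normalised by reference length, while position-independent error rate (PER) here uses one common bag-of-words formulation (the exact PER definition used in the thesis may differ in detail).

```python
from collections import Counter

def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    r, h = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i
    for j in range(len(h) + 1):
        dp[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution / match
    return dp[len(r)][len(h)] / len(r)

def per(reference, hypothesis):
    """Position-independent error rate: compare the sentences as bags of
    words, ignoring order; errors are unmatched words on either side."""
    r, h = Counter(reference.split()), Counter(hypothesis.split())
    matches = sum((r & h).values())                    # multiset intersection
    errors = max(sum(r.values()), sum(h.values())) - matches
    return errors / sum(r.values())
```

Because PER ignores word order, it never exceeds WER; a reordered but otherwise correct output scores 0% PER while still incurring WER substitutions, which is why the PER figures quoted above are lower than the WER figures.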
More information
Published date: 13 November 2012
Organisations:
University of Southampton, Web & Internet Science
Identifiers
Local EPrints ID: 345562
URI: http://eprints.soton.ac.uk/id/eprint/345562
PURE UUID: ebb2efd8-4da4-4e28-9b3d-1ebd50baeb3a
Catalogue record
Date deposited: 31 Mar 2016 12:25
Last modified: 14 Mar 2024 12:27
Contributors
Author: Abdulaziz Almohimeed
Thesis advisor: Mike Wald
Thesis advisor: R. I. Damper