University of Southampton Institutional Repository

Bayesian reordering model with feature selection


Alrajeh, Abdullah and Niranjan, Mahesan (2014) Bayesian reordering model with feature selection. ACL2014: The Ninth Workshop on Statistical Machine Translation, Baltimore, United States. 26-27 Jun 2014. pp. 477-485.

Record type: Conference or Workshop Item (Paper)

Abstract

In phrase-based statistical machine translation systems, variation in grammatical structures between source and target languages can cause large movements of phrases. Modeling such movements is crucial in achieving translations of long sentences that appear natural in the target language. We explore a generative learning approach to phrase reordering for Arabic-to-English translation. Formulating the reordering problem as a classification problem and using naive Bayes with feature selection, we achieve an improvement in the BLEU score over a lexicalized reordering model. The proposed model is compact, fast and scalable to a large corpus.
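
As a rough illustration of what "reordering as classification with naive Bayes and feature selection" can look like, the sketch below trains a small multinomial naive Bayes classifier that predicts a phrase-pair orientation from bag-of-words features. It is not taken from the paper: the orientation labels (`monotone`, `swap`), the feature strings, the mutual-information ranking used for feature selection, and the add-one smoothing are all assumptions chosen for this toy example, and the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def select_features(examples, k):
    """Keep the k features with the highest mutual information with the class.

    Assumption: mutual information on feature presence is used as the
    selection criterion; the record does not say which criterion the paper uses.
    """
    n = len(examples)
    class_counts = Counter(label for _, label in examples)
    feat_counts = Counter()
    joint = Counter()
    for feats, label in examples:
        for f in set(feats):
            feat_counts[f] += 1
            joint[f, label] += 1

    def mi(f):
        total = 0.0
        for c, nc in class_counts.items():
            for present in (True, False):
                n_fc = joint[f, c] if present else nc - joint[f, c]
                n_f = feat_counts[f] if present else n - feat_counts[f]
                if n_fc > 0:
                    total += n_fc / n * math.log(n_fc * n / (n_f * nc))
        return total

    # Sort by descending score, breaking ties by feature name for determinism.
    ranked = sorted(feat_counts, key=lambda f: (-mi(f), f))
    return set(ranked[:k])


def train(examples, selected):
    """Count classes and, per class, occurrences of the selected features."""
    class_counts = Counter()
    feat_counts = defaultdict(Counter)
    for feats, label in examples:
        class_counts[label] += 1
        for f in feats:
            if f in selected:
                feat_counts[label][f] += 1
    return class_counts, feat_counts


def predict(model, selected, feats):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_counts, feat_counts = model
    n = sum(class_counts.values())
    v = len(selected)
    best, best_score = None, -math.inf
    for c in sorted(class_counts):
        total = sum(feat_counts[c].values())
        score = math.log(class_counts[c] / n)
        for f in feats:
            if f in selected:
                score += math.log((feat_counts[c][f] + 1) / (total + v))
        if score > best_score:
            best, best_score = c, score
    return best


# Hypothetical toy data: each example is (features of a phrase pair, orientation).
examples = [
    (["s:a", "t:x", "common"], "monotone"),
    (["s:a", "t:y", "common"], "monotone"),
    (["s:b", "t:x", "common"], "swap"),
    (["s:b", "t:z", "common"], "swap"),
]

selected = select_features(examples, k=2)
model = train(examples, selected)
print(sorted(selected))
print(predict(model, selected, ["s:b", "t:y", "common"]))
```

Discarding uninformative features before training shrinks the per-class count tables, which is one plausible way a model of this kind could stay compact on a large corpus.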

Text
wmt14.pdf - Other (423kB)
W14-3361.pdf - Author's Original (231kB)

More information

Published date: June 2014
Venue - Dates: ACL2014: The Ninth Workshop on Statistical Machine Translation, Baltimore, United States, 2014-06-26 - 2014-06-27
Organisations: Southampton Wireless Group

Identifiers

Local EPrints ID: 366149
URI: http://eprints.soton.ac.uk/id/eprint/366149
PURE UUID: 53e9cc83-bef2-4e44-af8b-a0cee920d10e
ORCID for Mahesan Niranjan: orcid.org/0000-0001-7021-140X

Catalogue record

Date deposited: 26 Jun 2014 13:33
Last modified: 15 Mar 2024 03:29

Contributors

Author: Abdullah Alrajeh
Author: Mahesan Niranjan
