Large-scale reordering model for statistical machine translation using dual multinomial logistic regression
Phrase reordering is a challenge for statistical machine translation systems. Posing phrase movements as a prediction problem using contextual features modeled by a maximum entropy-based classifier is superior to the commonly used lexicalized reordering model. However, training this discriminative model on a large-scale parallel corpus can be computationally expensive. In this paper, we explore recent advancements in solving large-scale classification problems. Using the dual problem of multinomial logistic regression, we shrink the training data while iterating and achieve significant savings in computation and memory while preserving accuracy.
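To make the idea of a dual formulation with shrinking concrete, the following is a minimal sketch and not the authors' implementation. It assumes the standard L2-regularized objective (1/2)||W||^2 + C * sum_i [logsumexp_k(w_k . x_i) - w_{y_i} . x_i], whose dual has one probability vector alpha_i per training example. The function names, the trade-off parameter C, and the shrinking tolerance tol are illustrative assumptions; the abstract does not state the exact shrinking rule.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax; at the optimum alpha_i equals softmax(W x_i)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def weights_from_dual(X, y, alpha, C, n_classes):
    """Recover primal weights: w_k = C * sum_i (1[y_i == k] - alpha[i][k]) * x_i."""
    dim = len(X[0])
    W = [[0.0] * dim for _ in range(n_classes)]
    for xi, yi, ai in zip(X, y, alpha):
        for k in range(n_classes):
            coef = C * ((1.0 if k == yi else 0.0) - ai[k])
            for d in range(dim):
                W[k][d] += coef * xi[d]
    return W


def primal_objective(W, X, y, C):
    """(1/2)||W||^2 + C * sum of multinomial logistic losses."""
    reg = 0.5 * sum(w * w for row in W for w in row)
    loss = 0.0
    for xi, yi in zip(X, y):
        scores = [sum(w * v for w, v in zip(row, xi)) for row in W]
        m = max(scores)
        lse = m + math.log(sum(math.exp(s - m) for s in scores))
        loss += lse - scores[yi]
    return reg + C * loss


def dual_objective(alpha, X, y, C, n_classes):
    """Dual in minimisation form: (1/2)||W(alpha)||^2 + C * sum alpha*log(alpha)."""
    W = weights_from_dual(X, y, alpha, C, n_classes)
    reg = 0.5 * sum(w * w for row in W for w in row)
    # Convention: 0 * log(0) = 0, so zero entries are skipped.
    neg_entropy = sum(a * math.log(a) for ai in alpha for a in ai if a > 0.0)
    return reg + C * neg_entropy


def shrinkable(alpha, y, tol=1e-3):
    """Assumed heuristic: an example whose alpha_i is nearly one-hot on its
    label contributes almost nothing to W and could be set aside."""
    return [i for i, (ai, yi) in enumerate(zip(alpha, y)) if 1.0 - ai[yi] < tol]


if __name__ == "__main__":
    random.seed(0)
    n, dim, n_classes, C = 6, 3, 3, 1.0
    X = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
    y = [random.randrange(n_classes) for _ in range(n)]
    # Any feasible alpha (each row on the probability simplex) gives a bound.
    alpha = [softmax([random.uniform(-2, 2) for _ in range(n_classes)]) for _ in range(n)]
    W = weights_from_dual(X, y, alpha, C, n_classes)
    gap = primal_objective(W, X, y, C) + dual_objective(alpha, X, y, C, n_classes)
    print("duality gap (non-negative):", gap)
    print("shrinkable examples:", shrinkable(alpha, y))
```

The sum of the primal and dual objectives is a duality gap that is never negative and reaches zero only at the optimum, so it can serve as a stopping criterion. Under the assumed heuristic, an example whose dual vector is essentially one-hot on its true label adds roughly zero to every weight vector, which is why setting it aside can save computation and memory without changing the model much.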
Alrajeh, Abdullah
Niranjan, Mahesan
October 2014
Alrajeh, Abdullah and Niranjan, Mahesan (2014) Large-scale reordering model for statistical machine translation using dual multinomial logistic regression. Empirical Methods on Natural Language Processing 2014, Doha, Qatar. 25 - 29 Oct 2014.
Record type: Conference or Workshop Item (Paper)
Text: emnlp14.pdf - Author's Original
Text: emnlp-poster.pdf - Other
More information
Published date: October 2014
Venue - Dates:
Empirical Methods on Natural Language Processing 2014, Doha, Qatar, 2014-10-25 - 2014-10-29
Organisations:
Electronics & Computer Science
Identifiers
Local EPrints ID: 367606
URI: http://eprints.soton.ac.uk/id/eprint/367606
PURE UUID: 2a8a176c-4847-465d-806b-63486e5f3328
Catalogue record
Date deposited: 20 Aug 2014 15:30
Last modified: 15 Mar 2024 03:29
Contributors
Author: Abdullah Alrajeh
Author: Mahesan Niranjan