The University of Southampton
University of Southampton Institutional Repository

Large-scale reordering model for statistical machine translation using dual multinomial logistic regression

Alrajeh, Abdullah
64acb5ae-6e8e-44a0-9afa-edd1658cd0cd
Niranjan, Mahesan
5cbaeea8-7288-4b55-a89c-c43d212ddd4f

Alrajeh, Abdullah and Niranjan, Mahesan (2014) Large-scale reordering model for statistical machine translation using dual multinomial logistic regression. Empirical Methods in Natural Language Processing 2014, Doha, Qatar. 25 - 29 Oct 2014.

Record type: Conference or Workshop Item (Paper)

Abstract

Phrase reordering is a challenge for statistical machine translation systems. Posing phrase movements as a prediction problem using contextual features modeled by a maximum entropy-based classifier is superior to the commonly used lexicalized reordering model. However, training this discriminative model on a large-scale parallel corpus can be computationally expensive. In this paper, we explore recent advancements in solving large-scale classification problems. Using the dual problem of multinomial logistic regression, we shrink the training data while iterating and achieve significant savings in computation and memory while preserving accuracy.


More information

Published date: October 2014
Venue - Dates: Empirical Methods in Natural Language Processing 2014, Doha, Qatar, 2014-10-25 - 2014-10-29
Organisations: Electronics & Computer Science

Identifiers

Local EPrints ID: 367606
URI: http://eprints.soton.ac.uk/id/eprint/367606
PURE UUID: 2a8a176c-4847-465d-806b-63486e5f3328
ORCID for Mahesan Niranjan: orcid.org/0000-0001-7021-140X

Catalogue record

Date deposited: 20 Aug 2014 15:30
Last modified: 15 Mar 2024 03:29

Contributors

Author: Abdullah Alrajeh
Author: Mahesan Niranjan
