A Distributed Inflection Model for Translating into Morphologically Rich Languages

Open Access
Authors
Publication date 2015
Host editors
  • Y. Al-Onaizan
  • W. Lewis
Book title Proceedings of MT Summit XV. - Vol. 1
Book subtitle MT Researchers' Track: MT Summit XV : October 30-November 3, 2015, Miami, FL, USA
Event Machine Translation Summit XV
Pages (from-to) 145-159
Publisher Association for Machine Translation in the Americas
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Lexical sparsity is a major challenge for machine translation into morphologically rich languages. We address this problem by modeling sequences of fine-grained morphological tags in a bilingual context. To overcome the issue of ambiguous word analyses, we introduce soft tags, which are under-specified representations retaining all possible morphological attributes of a word. In order to learn distributed representations for the soft tags and their interactions, we adopt a neural network approach. This approach allows for the combination of source and target side information to model a wide range of inflection phenomena. Our re-inflection experiments show a substantial increase in accuracy compared to a model trained on morphologically disambiguated data. Integrated into an SMT decoder and evaluated for English-Italian and English-Russian translation, our model yields improvements of up to 1.0 BLEU over a competitive baseline.
Document type Conference contribution
Language English
Published at
  • http://www.mt-archive.info/15/MTS-2015-Tran.pdf
  • http://amtaweb.org/wp-content/uploads/2015/10/MTSummitXV_ResearchTrack.pdf
Other links http://www.mt-archive.info/15/MTS-2015-TOC.htm