A Distributed Inflection Model for Translating into Morphologically Rich Languages
| Authors | |
|---|---|
| Publication date | 2015 |
| Host editors | |
| Book title | Proceedings of MT Summit XV. - Vol. 1 |
| Book subtitle | MT Researchers' Track: MT Summit XV : October 30-November 3, 2015, Miami, FL, USA |
| Event | Machine Translation Summit XV |
| Pages (from-to) | 145-159 |
| Publisher | Association for Machine Translation in the Americas |
| Organisations | |
| Abstract | Lexical sparsity is a major challenge for machine translation into morphologically rich languages. We address this problem by modeling sequences of fine-grained morphological tags in a bilingual context. To overcome the issue of ambiguous word analyses, we introduce soft tags, which are under-specified representations retaining all possible morphological attributes of a word. In order to learn distributed representations for the soft tags and their interactions we adopt a neural network approach. This approach allows for the combination of source and target side information to model a wide range of inflection phenomena. Our re-inflection experiments show a substantial increase in accuracy compared to a model trained on morphologically disambiguated data. Integrated into an SMT decoder and evaluated for English-Italian and English-Russian translation, our model yields improvements of up to 1.0 BLEU over a competitive baseline. |
| Document type | Conference contribution |
| Language | English |
| Published at | http://www.mt-archive.info/15/MTS-2015-Tran.pdf http://amtaweb.org/wp-content/uploads/2015/10/MTSummitXV_ResearchTrack.pdf |
| Other links | http://www.mt-archive.info/15/MTS-2015-TOC.htm |
| Downloads | mtsummit15 (Final published version) |
| Permalink to this page | |
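
The soft-tag idea mentioned in the abstract can be illustrated with a small sketch. It assumes that a soft tag is simply the union of attribute values across all candidate analyses of a word, so that no analysis is discarded. The function name `soft_tag`, the attribute names, and the example analyses are hypothetical and are not taken from the paper, which additionally learns distributed representations over such tags with a neural network.

```python
from collections import defaultdict


def soft_tag(analyses):
    """Merge all candidate analyses of one word into an under-specified tag.

    Each analysis is a dict mapping an attribute name to a single value.
    The result maps every attribute to a sorted tuple of all values seen,
    so ambiguity is kept rather than resolved (an assumed construction).
    """
    merged = defaultdict(set)
    for analysis in analyses:
        for attribute, value in analysis.items():
            merged[attribute].add(value)
    # Sort attributes and values so the tag is deterministic and comparable.
    return {attr: tuple(sorted(vals)) for attr, vals in sorted(merged.items())}


# Hypothetical analyses for one ambiguous word form (illustrative values only).
analyses = [
    {"pos": "NOUN", "case": "gen", "number": "sg"},
    {"pos": "NOUN", "case": "nom", "number": "pl"},
]
tag = soft_tag(analyses)
print(tag)
```

In this sketch the attributes on which the analyses disagree (`case`, `number`) keep both values, while the attribute on which they agree (`pos`) keeps one.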