Parsing statistical machine translation output

Authors
Publication date 2009
Host editors
  • Z. Vetulani
Book title Human language technologies as a challenge for computer science and linguistics: 4th Language & Technology Conference, November, 6-8, 2009, Poznań, Poland: proceedings
ISBN
  • 9788371777462
Event 4th Language & Technology Conference (LTC'09), Poznań, Poland
Pages (from-to) 270-274
Publisher Poznań: Wydawnictwo Poznańskie
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Despite increasing research into the use of syntax in statistical machine translation (SMT), the incorporation of syntax into language models has seen limited success. We present a study of the discriminative abilities of generative syntax-based language models, over and above standard n-gram models, with a focus on potential applications for SMT. We show that parsers are in fact better able than n-gram models to discriminate between good and bad English, and that parsers, like n-gram language models, assign higher average log probabilities to references than to SMT output.
Document type Conference contribution
Language English
Published at http://www.scarter.org/ltc2009-CameraReady.pdf