The UvA-LINKER offers a range of other options for finding the full text of a publication (including a direct link to the full text if it is available in another database on the internet).
| Field | Value |
|---|---|
| Authors | H. Hassan, K. Sima'an, A. Way |
| Title | Syntactically lexicalized phrase-based SMT |
| Journal | IEEE Transactions on Audio, Speech and Language Processing |
| Faculty | Faculty of Science |
| Institute/dept. | FNWI: Institute for Logic, Language and Computation (ILLC) |
| Abstract | Until quite recently, extending phrase-based statistical machine translation (PBSMT) with syntactic knowledge caused system performance to deteriorate. The most recent successful enrichments of PBSMT with hierarchical structure either employ nonlinguistically motivated syntax for capturing hierarchical reordering phenomena, or extend the phrase translation table with redundantly ambiguous syntactic structures over phrase pairs. In this paper, we present an extended, harmonized account of our previous work which showed that incorporating linguistically motivated lexical syntactic descriptions, called supertags, can yield significantly better PBSMT systems at insignificant extra computational cost. We describe a novel PBSMT model that integrates supertags into the target language model and the target side of the translation model. Two kinds of supertags are employed: those from lexicalized tree-adjoining grammar and combinatory categorial grammar. Despite the differences between the two sets of supertags, they give similar improvements. In addition to integrating the Markov supertagging approach in PBSMT, we explore the utility of a new surface grammaticality measure based on combinatory operators. We perform various experiments on the Arabic-to-English NIST 2005 test set addressing the issues of sparseness, scalability, and the utility of system subcomponents. We show that even when the parallel training data grows very large, the supertagged system retains a relatively stable absolute performance advantage over the unadorned PBSMT system. Arguably, this hints at a performance gap that cannot be bridged by acquiring more phrase pairs. Our best result shows a relative improvement of 6.1% over a state-of-the-art PBSMT model, which compares favorably with the leading systems on the NIST 2005 task. We also demonstrate that the advantages of a supertag-based system carry over to German-English, where improvements of up to 8.9% relative to the baseline system are observed. |
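The abstract describes scoring the target side with an n-gram model over supertag sequences alongside the usual word language model. The sketch below illustrates that general idea only: a toy add-alpha smoothed trigram model over CCG-style supertag sequences. All function names, the smoothing scheme, and the toy tag sequences are invented for illustration and are not taken from the paper.

```python
# Illustrative sketch of a supertag n-gram language model (NOT the
# paper's implementation): an n-gram model over supertag sequences,
# which in the described system scores the target side in addition
# to the word-level LM. Names, smoothing, and data are hypothetical.
import math
from collections import defaultdict

def train_ngram(sequences, n=3):
    """Count n-grams and their (n-1)-gram histories over tag sequences."""
    counts = defaultdict(int)
    history = defaultdict(int)
    for seq in sequences:
        padded = ["<s>"] * (n - 1) + seq + ["</s>"]
        for i in range(n - 1, len(padded)):
            ctx = tuple(padded[i - n + 1:i])
            counts[ctx + (padded[i],)] += 1
            history[ctx] += 1
    return counts, history

def logprob(seq, counts, history, n=3, alpha=1.0, vocab=50):
    """Add-alpha smoothed log-probability of a supertag sequence."""
    padded = ["<s>"] * (n - 1) + seq + ["</s>"]
    lp = 0.0
    for i in range(n - 1, len(padded)):
        ctx = tuple(padded[i - n + 1:i])
        c = counts.get(ctx + (padded[i],), 0)
        lp += math.log((c + alpha) / (history.get(ctx, 0) + alpha * vocab))
    return lp

# Toy CCG-style supertag sequences, e.g. a transitive-verb clause.
train = [["NP", "(S\\NP)/NP", "NP"],
         ["NP", "S\\NP"]]
counts, hist = train_ngram(train)
good = logprob(["NP", "(S\\NP)/NP", "NP"], counts, hist)
bad = logprob(["(S\\NP)/NP", "NP", "NP"], counts, hist)
assert good > bad  # the attested tag order scores higher
```

In the paper's setting such a model is one feature among several in the log-linear PBSMT framework, so a well-formed supertag sequence raises a candidate translation's overall score rather than filtering it outright.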
Use this URL to link to this page: http://dare.uva.nl/en/record/300764