Permutation forests for modeling word order in machine translation
| Authors | |
|---|---|
| Supervisors | |
| Cosupervisors | |
| Award date | 13-12-2017 |
| Number of pages | 148 |
| Organisations | |
| Abstract |
In natural language, there is only a limited space for variation in the word order of linguistic productions. From a linguistic perspective, word order is the result of repeated application of recursive syntactic functions. These syntactic operations produce hierarchical syntactic structures, as well as a string of words that appear in a certain order.
However, different languages are governed by different syntactic rules. Thus, one of the main problems in machine translation is to find the mapping between word order in the source language and word order in the target language. This is often done by syntactic transfer, in which the syntactic tree is recovered from the source sentence and then transduced so that its form is consistent with the syntactic rules of the target language. In this dissertation, I propose an alternative to syntactic transfer that maintains its good properties – namely the compositional and hierarchical structure – but, unlike syntactic transfer, is derived directly from data without requiring any linguistic annotation. This approach brings two main advantages. First, it allows hierarchical reordering to be applied even to languages for which no syntactic parsers are available. Second, unlike the trees used in syntactic transfer, which in some cases cannot cover the reordering patterns present in the data, the trees used in this work are built directly over the reordering patterns, so they cover them by definition. |
| Document type | PhD thesis |
| Note | ILLC Dissertation Series DS-2017-09 |
| Language | English |
