A.J. van Hessen
- Pre-processing input text: improving pronunciation for the fluent Dutch text-to-speech synthesizer
- Book/source title: Institute of Phonetic Sciences proceedings 22
- Pages (from-to)
- Amsterdam: Institute of Phonetic Sciences, University of Amsterdam
- Document type: Conference contribution
- Faculty of Humanities (FGw)
- Amsterdam Center for Language and Communication (ACLC)
To improve the pronunciation of the Fluent Dutch Text-To-Speech Synthesiser, two pre-processors were built that attempt to detect problematic cases in input texts and resolve them automatically where possible. The first pre-processor examines the pronounceability of surnames and company names by checking whether their initial and final two-letter combinations can be handled by the grapheme-to-phoneme rules of the Fluency TTS system, and corrects them automatically where possible. It also expands common unambiguous abbreviations. The second pre-processor attempts to generate pronounceable forms for numbers that do not have a straightforward pronunciation. Structural and contextual information is used in an attempt to determine which category a number belongs to, and each number is expanded according to the pronunciation conventions of that category. These pre-processors are a useful aid for offline examination of pronounceability (for names) and for improving performance at run-time (for numbers), although ambiguity and redundancy in the input text illustrate the need for semantic and syntactic parsing to approach human text interpretation skills.
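The two checks described in the abstract can be illustrated with a minimal Python sketch. It is not the Fluency implementation: the sets of supported two-letter combinations, the number categories, the context words, and the function names `check_name` and `classify_number` are all hypothetical and chosen only to show the general idea.

```python
import re

# ASSUMPTION: toy sets of two-letter combinations that the
# grapheme-to-phoneme rules are taken to handle. The actual rule
# coverage of the Fluency TTS system is not given in this record.
SUPPORTED_INITIAL = {"ja", "de", "va", "sm", "be"}
SUPPORTED_FINAL = {"en", "er", "rg", "it", "an"}

# ASSUMPTION: illustrative context words that hint at a phone number.
PHONE_CONTEXT = {"tel", "tel.", "telefoon"}


def check_name(name: str) -> list:
    """Return the edge problems found in a surname or company name."""
    letters = name.lower()
    if len(letters) < 2:
        return ["too short to check"]
    problems = []
    # Check the first two letters against the supported initial pairs.
    if letters[:2] not in SUPPORTED_INITIAL:
        problems.append("initial '%s' not covered" % letters[:2])
    # Check the last two letters against the supported final pairs.
    if letters[-2:] not in SUPPORTED_FINAL:
        problems.append("final '%s' not covered" % letters[-2:])
    return problems


def classify_number(token: str, previous_word: str = "") -> str:
    """Guess a number category from structure first, then from context."""
    # Structural cues: the shape of the token itself.
    if re.fullmatch(r"\d{1,2}:\d{2}", token):
        return "time"
    if re.fullmatch(r"\d{1,2}-\d{1,2}-\d{4}", token):
        return "date"
    # Contextual cue: the word directly before the number.
    if previous_word.lower() in PHONE_CONTEXT:
        return "phone"
    if re.fullmatch(r"\d+", token):
        return "cardinal"
    return "unknown"


if __name__ == "__main__":
    print(check_name("Jansen"))
    print(check_name("Szczepanski"))
    print(classify_number("14:30"))
    print(classify_number("0205252183", "tel."))
```

In this sketch a name with no reported problems would pass through unchanged, while a flagged name would be queued for correction or manual review. Each number category would then be mapped to its own expansion routine, which is omitted here.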