- Explorations in automated language classification
- Folia Linguistica
- Volume 42, Issue 2
- Faculty of Humanities (FGw)
- Amsterdam Center for Language and Communication (ACLC)
An earlier paper, to which some of the present authors contributed (Brown et al. 2008), describes a method for automating language classification based on the 100-item referent list of Swadesh (1955). Here we discuss a refinement of that method: the relative stabilities of the list items are calculated, and the list is shortened by eliminating the least stable items. The result is a 40-item referent list. We explain the method for determining stabilities, as well as a method for comparing the classificatory performance of reduced lists of different sizes with that of the full 100-item list. We also present a statistical investigation of the relationship between the lexical similarity of languages and their geographical proximity. Finally, we test the possibility that information about typological features of languages can be combined with lexical data to enhance classificatory accuracy.