Authors: S. Wichmann, E.W. Holman, D. Bakker, C.H. Brown
Title: Evaluating linguistic distance measures
Journal: Physica A: Statistical Mechanics and its Applications
Faculty: Faculty of Humanities
Institute/dept.: FGw: Amsterdam Center for Language and Communication (ACLC)
Abstract: In Ref. [13], Petroni and Serva discuss the use of Levenshtein distances (LD) between words referring to the same concepts as a tool for establishing overall distances among languages, which can subsequently be used to derive phylogenies. The authors modify the raw LD by dividing it by the length of the longer of the two words compared, producing what could be called LDN (normalized LD). Other scholars [7] and [8] have used a further modification, dividing the LDN by the average LDN among words not referring to the same concept. This produces what could be called LDND. The authors of Ref. [13] question whether LDND is a more adequate measure of distance than LDN. Here we show empirically that LDND is the better measure when the languages compared have not already been shown to be related by other, more traditional methods of comparative linguistics. If automated language classification is to be used as a tool independent of traditional methods, then the further modification is necessary.
Document type: Article
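
The three measures named in the abstract (LD, LDN, LDND) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the normalization is aggregated, so the sketch assumes that LDND for a language pair is the mean LDN over same-concept word pairs divided by the mean LDN over different-concept word pairs. The function names and the dictionary-based word-list format are hypothetical, chosen only for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Raw Levenshtein distance (LD): minimum number of single-symbol
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def ldn(a: str, b: str) -> float:
    """LDN: LD divided by the length of the longer of the two words."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def ldnd(words_a: dict, words_b: dict) -> float:
    """LDND for two languages (assumed aggregation, see lead-in):
    mean LDN over same-concept pairs divided by mean LDN over
    different-concept pairs. Inputs map concept -> word."""
    concepts = sorted(set(words_a) & set(words_b))
    if len(concepts) < 2:
        raise ValueError("need at least two shared concepts")
    same = [ldn(words_a[c], words_b[c]) for c in concepts]
    diff = [ldn(words_a[c1], words_b[c2])
            for c1 in concepts for c2 in concepts if c1 != c2]
    mean_diff = sum(diff) / len(diff)
    if mean_diff == 0:
        raise ValueError("different-concept baseline is zero")
    return (sum(same) / len(same)) / mean_diff


# Toy, made-up word lists (not real language data).
lang_a = {"x": "ab", "y": "cd"}
lang_b = {"x": "ac", "y": "cd"}
print(ldn("kitten", "sitting"), ldnd(lang_a, lang_b))
```

The divisor in `ldnd` estimates the similarity that two word lists show by chance, for example through overlapping sound inventories, which is why the abstract treats this second normalization as necessary when relatedness has not been established beforehand.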
