Distributional Semantics for Neo-Latin

J. Bloem; M.C. Parisi; M. Reynaert; Y. Oortwijn; A. Betti

Authors	J. Bloem; M.C. Parisi; M. Reynaert; Y. Oortwijn; A. Betti
Publication date	2020
Host editors	R. Sprugnoli; M. Passarotti
Book title	1st Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA 2020)
Book subtitle	Proceedings : LREC 2020 Workshop, Language Resources and Evaluation Conference, 11–16 May 2020
ISBN (electronic)	9791095546535
Event	12th International Conference on Language Resources and Evaluation, LREC 2020
Pages (from-to)	84-93
Number of pages	10
Publisher	Paris: European Language Resources Association (ELRA)
Organisations	Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract	We address the problem of creating and evaluating quality Neo-Latin word embeddings for the purpose of philosophical research, adapting the Nonce2Vec tool to learn embeddings from Neo-Latin sentences. This distributional semantic modeling tool can learn from tiny data incrementally, using a larger background corpus for initialization. We conduct two evaluation tasks: definitional learning of Latin Wikipedia terms, and learning consistent embeddings from 18th century Neo-Latin sentences pertaining to the concept of mathematical method. Our results show that consistent Neo-Latin word embeddings can be learned from this type of data. While our evaluation results are promising, they do not reveal to what extent the learned models match domain expert knowledge of our Neo-Latin texts. Therefore, we propose an additional evaluation method, grounded in expert-annotated data, that would assess whether learned representations are conceptually sound in relation to the domain of study.
Document type	Conference contribution
Language	English
Related dataset	Neo-Latin word embeddings and preprocessed text
Published at	https://www.aclweb.org/anthology/2020.lt4hala-1.13 http://www.lrec-conf.org/proceedings/lrec2020/workshops/LT4HALA/pdf/2020.lt4hala-1.13.pdf
Other links	http://www.lrec-conf.org/proceedings/lrec2020/workshops/LT4HALA/index.html
Downloads	2020.lt4hala-1.13-1 (Final published version)