Domain-specific Evaluation of Word Embeddings for Philosophical Text using Direct Intrinsic Evaluation

Authors	G. van Boven, J. Bloem
Publication date	2022
Host editors	M. Hämäläinen, K. Alnajjar, N. Partanen, J. Rueter
Book title	The 2nd International Workshop on Natural Language Processing for Digital Humanities
Book subtitle	proceedings of the workshop : NLP4DH 2022 : November 20, 2022
ISBN (electronic)	9781955917759
Event	2nd International Workshop on Natural Language Processing for Digital Humanities (NLP4DH)
Pages (from-to)	101-107
Number of pages	7
Publisher	Stroudsburg, PA: Association for Computational Linguistics
Organisations	Interfacultary Research - Institute for Logic, Language and Computation (ILLC); Faculty of Humanities (FGw) - Amsterdam Institute for Humanities Research (AIHR)
Abstract	We perform a direct intrinsic evaluation of word embeddings trained on the works of a single philosopher. Six models are compared to human judgements elicited using two tasks: a synonym detection task and a coherence task. We apply a method that elicits judgements based on explicit knowledge from experts, as the linguistic intuition of non-expert participants might differ from that of the philosopher. We find that an in-domain SVD model has the best 1-nearest neighbours for target terms, while transfer learning-based Nonce2Vec performs better for low-frequency target terms.
Document type	Conference contribution
Language	English
Published at	https://doi.org/10.18653/v1/2022.nlp4dh-1.14 (Final published version)