Domain-specific Evaluation of Word Embeddings for Philosophical Text using Direct Intrinsic Evaluation

Open Access
Authors
Publication date 2022
Host editors
  • M. Hämäläinen
  • K. Alnajjar
  • N. Partanen
  • J. Rueter
Book title The 2nd International Workshop on Natural Language Processing for Digital Humanities
Book subtitle proceedings of the workshop : NLP4DH 2022 : November 20, 2022
ISBN (electronic)
  • 9781955917759
Event 2nd International Workshop on Natural Language Processing for Digital Humanities (NLP4DH)
Pages (from-to) 101-107
Number of pages 7
Publisher Stroudsburg, PA: Association for Computational Linguistics
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
  • Faculty of Humanities (FGw) - Amsterdam Institute for Humanities Research (AIHR)
Abstract We perform a direct intrinsic evaluation of word embeddings trained on the works of a single philosopher. Six models are compared to human judgements elicited using two tasks: a synonym detection task and a coherence task. We apply a method that elicits judgements based on explicit knowledge from experts, as the linguistic intuition of non-expert participants might differ from that of the philosopher. We find that an in-domain SVD model has the best 1-nearest neighbours for target terms, while transfer learning-based Nonce2Vec performs better for low-frequency target terms.
Document type Conference contribution
Language English
Published at https://doi.org/10.18653/v1/2022.nlp4dh-1.14
Downloads
2022.nlp4dh-1.14 (Final published version)