Benchmarking Named Entity Recognition Approaches for Extracting Research Infrastructure Information from Text

Open Access
Authors
  • G. Cheirmpos
  • S.A. Tabatabaei
  • E. Kanoulas
  • G. Tsatsaronis
Publication date 2024
Host editors
  • G. Nicosia
  • V. Ojha
  • E. La Malfa
  • G. La Malfa
  • P.M. Pardalos
  • R. Umeton
Book title Machine Learning, Optimization, and Data Science
Book subtitle 9th International Conference, LOD 2023, Grasmere, UK, September 22–26, 2023 : revised selected papers
ISBN
  • 9783031539688
ISBN (electronic)
  • 9783031539695
Series Lecture Notes in Computer Science
Event 9th International Conference on Machine Learning, Optimization, and Data Science
Volume | Issue number I
Pages (from-to) 131–141
Publisher Cham: Springer
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Named entity recognition (NER) is an important component of many information extraction and linking pipelines. The task is especially challenging in a low-resource scenario, where only a very limited amount of high-quality annotated data is available. In this paper, we benchmark machine learning approaches for NER that may be very effective in such cases and compare their performance in a novel application: extracting research infrastructure information from scientific manuscripts. We explore approaches such as incorporating Contrastive Learning (CL) and Conditional Random Field (CRF) weights in BERT-based architectures, and demonstrate experimentally that such combinations are very efficient in few-shot learning setups, confirming similar findings reported in other areas of NLP as well as in Computer Vision. More specifically, we show that using CRF weights in BERT-based architectures improves performance on the overall NER task by approximately 12%, and that in few-shot setups the effectiveness of CRF weights is much higher with smaller training sets.
Document type Conference contribution
Language English
Published at https://doi.org/10.1007/978-3-031-53969-5_11