Recognizing and Linking Entities in Old Dutch Text: A Case Study on VOC Notary Records

Open Access
Authors
Publication date 2021
Host editors
  • A. Weber
  • M. Heerlien
  • E. Gassó Miracle
  • K. Wolstencroft
Book title Proceedings of the International Conference Collect and Connect: Archives and Collections in a Digital Age
Book subtitle Leiden, the Netherlands, November 23-24, 2020
Series CEUR Workshop Proceedings
Event 2020 International Conference Collect and Connect: Archives and Collections in a Digital Age, COLCO 2020
Pages (from-to) 25-36
Number of pages 12
Publisher Aachen: CEUR-WS
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract

The increased availability of digitised historical archives allows researchers to discover detailed information about people and companies from the past. However, the unconnected nature of these datasets presents a non-trivial challenge. In this paper, we present an approach and experiments to recognise person names in digitised notary records and link them to their job registration in the Dutch East India Company’s records. Our approach shows that standard state-of-the-art language models have difficulties dealing with 18th-century texts. However, a small amount of domain adaptation can improve the connection of information on sailors from different archives.

Document type Conference contribution
Language English
Published at http://ceur-ws.org/Vol-2810/paper3.pdf
Other links http://ceur-ws.org/Vol-2810/ https://www.scopus.com/pages/publications/85101217352
Downloads
paper3-1 (Final published version)