Entity-centric neural models for natural language processing

Open Access
Authors
Supervisors
Award date 14-05-2024
Series ILLC Dissertation Series, DS-2024-03
Number of pages 225
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
  • Faculty of Science (FNWI)
Abstract
This thesis explores how to enhance natural language understanding by incorporating entity information into neural network models. It tackles three key questions:
1. Leveraging entities for understanding tasks: This work introduces Entity-GCN, a model that performs multi-step reasoning over a graph whose nodes represent entity mentions and whose edges represent relations between them. The method achieved state-of-the-art results on a multi-document question-answering dataset (a schematic message-passing sketch follows this list).
2. Identifying and disambiguating entities using large language models: This research proposes a novel system that retrieves entities by generating their names token by token, overcoming limitations of traditional methods and significantly reducing the memory footprint. The approach is also extended to a multilingual setting and further optimized for efficiency (see the constrained-decoding sketch after this list).
3. Interpreting and controlling entity knowledge within models: This thesis presents a post-hoc interpretation technique for analyzing how decisions are formed across the layers of neural models, allowing knowledge representations to be visualized and analyzed. In addition, a method for editing factual knowledge about entities is proposed, enabling the correction of model predictions without costly retraining (a toy constrained-update example follows this list).
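The multi-step reasoning in point 1 can be pictured as repeated relational message passing over the mention graph. The sketch below is a minimal illustration under assumed ingredients, not the thesis's Entity-GCN: the relation names (same_doc, coref), the mean aggregation, and the ReLU non-linearity are placeholders.

```python
import numpy as np

def relational_message_passing(node_states, edges, rel_weights, self_weight):
    """One message-passing step over a graph of entity mentions.

    node_states : (num_nodes, dim) array of mention representations.
    edges       : list of (src, dst, rel) triples; rel names a relation type
                  such as "same_doc" or "coref" (an illustrative inventory).
    rel_weights : dict mapping rel -> (dim, dim) weight matrix.
    self_weight : (dim, dim) weight matrix for the self-loop.
    """
    num_nodes, _ = node_states.shape
    messages = node_states @ self_weight            # self-loop contribution
    counts = np.ones(num_nodes)                     # degrees for mean pooling
    for src, dst, rel in edges:
        messages[dst] += node_states[src] @ rel_weights[rel]
        counts[dst] += 1.0
    return np.maximum(messages / counts[:, None], 0.0)   # ReLU non-linearity


# Toy usage: 4 mention nodes, 8-dimensional states, two relation types.
rng = np.random.default_rng(0)
dim = 8
states = rng.normal(size=(4, dim))
weights = {r: rng.normal(size=(dim, dim)) * 0.1 for r in ("same_doc", "coref")}
self_w = np.eye(dim)
edges = [(0, 1, "same_doc"), (1, 2, "coref"), (2, 3, "same_doc")]

for _ in range(3):                                   # multi-step reasoning
    states = relational_message_passing(states, edges, weights, self_w)
print(states.shape)                                  # (4, 8)
```

Stacking several such steps is what lets information from a mention in one document reach related mentions in other documents.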
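Point 2's token-by-token entity retrieval is typically realized with constrained decoding: the generator may only emit tokens that continue some entity name in the knowledge base, so a prefix trie over the names can stand in for a dense per-entity index. The sketch below illustrates that idea only; PrefixTrie, the word-level "tokens", and the dummy next_token_score function are assumed stand-ins, not the thesis's implementation.

```python
class PrefixTrie:
    """Trie over tokenized entity names from a knowledge base.

    During decoding, only tokens that continue some valid entity name are
    allowed, so the generator can never produce a name outside the KB.
    """
    END = "</name>"

    def __init__(self, tokenized_names):
        self.root = {}
        for tokens in tokenized_names:
            node = self.root
            for tok in list(tokens) + [self.END]:
                node = node.setdefault(tok, {})

    def allowed_next(self, prefix):
        node = self.root
        for tok in prefix:
            node = node.get(tok)
            if node is None:
                return []
        return list(node.keys())


def constrained_greedy_decode(next_token_score, trie, max_len=16):
    """Generate an entity name token by token under the trie constraint.

    next_token_score(prefix, token) stands in for a language model's
    next-token score; any autoregressive scorer could be plugged in here.
    """
    prefix = []
    for _ in range(max_len):
        candidates = trie.allowed_next(prefix)
        if not candidates:
            break
        best = max(candidates, key=lambda tok: next_token_score(prefix, tok))
        if best == PrefixTrie.END:
            break
        prefix.append(best)
    return prefix


# Toy usage with word-level "tokens" and a dummy scorer favouring "New York City".
kb = [["New", "York", "City"], ["New", "Zealand"], ["Amsterdam"]]
trie = PrefixTrie(kb)
score = lambda prefix, tok: 1.0 if tok in ("New", "York", "City") else 0.0
print(constrained_greedy_decode(score, trie))        # ['New', 'York', 'City']
```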
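Point 3's knowledge editing can be thought of as a targeted, constrained parameter update: change the answer to one query while keeping the model's behaviour elsewhere approximately fixed. The toy below shows only that general idea on a logistic-regression "model"; the edit_fact function, the gradient-plus-locality-penalty recipe, and all hyper-parameters are illustrative assumptions, not the thesis's actual editing method.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def edit_fact(w, x_edit, y_edit, x_keep, lr=0.5, locality=5.0, steps=50):
    """Nudge weights w so the model flips its answer on x_edit, while a
    locality penalty keeps its outputs on x_keep close to what they were.

    Toy illustration of a constrained update, not the thesis's method.
    """
    y_keep = sigmoid(x_keep @ w)              # predictions we want to preserve
    for _ in range(steps):
        p_edit = sigmoid(x_edit @ w)
        p_keep = sigmoid(x_keep @ w)
        # gradient of the edit loss plus a locality penalty on kept predictions
        grad = (p_edit - y_edit) * x_edit
        grad += locality * (x_keep.T @ ((p_keep - y_keep) * p_keep * (1 - p_keep))) / len(x_keep)
        w = w - lr * grad
    return w

# Toy usage: one "fact" to flip, three predictions to leave untouched.
rng = np.random.default_rng(1)
w = rng.normal(size=4)
x_edit = rng.normal(size=4)
x_keep = rng.normal(size=(3, 4))
before = sigmoid(x_keep @ w)
w_new = edit_fact(w, x_edit, y_edit=1.0, x_keep=x_keep)
print(round(float(sigmoid(x_edit @ w_new)), 3))               # driven toward 1.0
print(np.round(np.abs(sigmoid(x_keep @ w_new) - before), 3))  # small drift elsewhere
```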
Document type PhD thesis
Language English