Entity-centric neural models for natural language processing
| Authors | |
|---|---|
| Supervisors | |
| Award date | 14-05-2024 |
| Series | ILLC Dissertation Series, DS-2024-03 |
| Number of pages | 225 |
| Organisations | |
| Abstract |

This thesis explores how to enhance natural language understanding by incorporating entity information into neural network models. It tackles three key questions:

1. **Leveraging entities for understanding tasks.** This work introduces Entity-GCN, a model that performs multi-step reasoning on a graph whose nodes represent entity mentions and whose edges represent relationships between them. The method achieved state-of-the-art results on a multi-document question-answering dataset.
2. **Identifying and disambiguating entities using large language models.** This research proposes a novel system that retrieves entities by generating their names token by token, overcoming limitations of traditional methods and significantly reducing the memory footprint. The approach is also extended to a multilingual setting and further optimized for efficiency.
3. **Interpreting and controlling entity knowledge within models.** This thesis presents a post-hoc interpretation technique for analyzing how decisions are made across the layers of a neural model, allowing knowledge representations to be visualized and analyzed. Additionally, a method for editing factual knowledge about entities is proposed, enabling model predictions to be corrected without costly retraining.
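The second contribution above, retrieving an entity by generating its name token by token, can be illustrated with a minimal sketch. This is not the thesis's actual implementation: it replaces the neural language model with a hypothetical toy scoring function, but it shows the core trick of constraining decoding with a prefix trie so that only valid entity names can ever be produced, which is what removes the need for a huge classifier (and its memory cost) over millions of entities.

```python
def build_trie(names):
    """Map each whitespace-tokenized entity name into a nested-dict prefix trie."""
    trie = {}
    for name in names:
        node = trie
        for tok in name.split():
            node = node.setdefault(tok, {})
        node["<eos>"] = {}  # end-of-name marker: a complete name may stop here
    return trie

def generate(trie, score):
    """Greedy constrained decoding: at each step, only tokens the trie
    allows as continuations of the current prefix are candidates."""
    node, out = trie, []
    while True:
        allowed = list(node)                # valid next tokens for this prefix
        best = max(allowed, key=score)      # toy stand-in for a model's scores
        if best == "<eos>":
            return " ".join(out)
        out.append(best)
        node = node[best]

# Hypothetical example: score a token by how often it appears in the query.
entities = ["Barack Obama", "Barack Obama Sr.", "Michelle Obama"]
trie = build_trie(entities)
query = "Who was president? Barack Obama Sr. maybe"
name = generate(trie, lambda t: query.split().count(t))  # "Barack Obama Sr."
```

In the thesis's setting the `score` function would be a sequence-to-sequence model's next-token distribution, with the trie used to mask out invalid tokens at each decoding step.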
| Document type | PhD thesis |
| Language | English |
