Scientific and Creative Analogies in Pretrained Language Models
| Authors | |
|---|---|
| Publication date | 2022 |
| Host editors | |
| Book title | Findings of the Association for Computational Linguistics: EMNLP 2022 |
| Book subtitle | Conference on Empirical Methods in Natural Language Processing (EMNLP), Abu Dhabi, United Arab Emirates, 7-11 December 2022 |
| Event | The 2022 Conference on Empirical Methods in Natural Language Processing |
| Pages (from-to) | 2094-2100 |
| Number of pages | 7 |
| Publisher | Stroudsburg, PA: Association for Computational Linguistics |
| Organisations | |
| Abstract | This paper examines the encoding of analogy in large-scale pretrained language models, such as BERT and GPT-2. Existing analogy datasets typically focus on a limited set of analogical relations, with high similarity between the two domains across which the analogy holds. As a more realistic setup, we introduce the Scientific and Creative Analogy dataset (SCAN), a novel analogy dataset containing systematic mappings of multiple attributes and relational structures across dissimilar domains. Using this dataset, we test the analogical reasoning capabilities of several widely used pretrained language models (LMs). We find that state-of-the-art LMs achieve low performance on these complex analogy tasks, highlighting the challenges still posed by analogy understanding. |
| Document type | Conference contribution |
| Note | With supplementary video |
| Language | English |
| DOI | https://doi.org/10.18653/v1/2022.findings-emnlp.153 |
| Other links | https://www.scopus.com/pages/publications/85149846242 |
| Downloads | 2022.findings-emnlp.153 (Final published version) |
| Supplementary materials | |
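For readers who want to try this kind of probing themselves, the sketch below scores cross-domain analogy completions with GPT-2 by comparing sentence-level negative log-likelihoods. The prompt wording, the candidate set, and the likelihood-based scoring rule are illustrative assumptions made here, not the paper's evaluation protocol; consult the published version for the actual SCAN setup.

```python
# Minimal sketch: zero-shot analogy probing with a pretrained LM.
# The prompt and candidates are hypothetical examples, not items from SCAN.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_nll(text: str) -> float:
    """Mean negative log-likelihood of `text` under the LM (lower = more plausible)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)  # labels=ids gives shifted cross-entropy loss
    return out.loss.item()

# A solar-system -> atom mapping, in the spirit of the cross-domain
# analogies the abstract describes.
prompt = "If the sun is like the nucleus, then a planet is like {}."
candidates = ["an electron", "a proton", "a molecule", "a star"]

scores = {c: sentence_nll(prompt.format(c)) for c in candidates}
print(min(scores, key=scores.get))  # the completion the LM prefers
```

Ranking candidates by LM likelihood in this way is one common zero-shot probing recipe; it illustrates the task format only, and any results should be taken from the published paper rather than from this sketch.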