Dissecting incongruity: Metaphor and humor understanding of large language models
| Authors | |
|---|---|
| Supervisors | |
| Cosupervisors | |
| Award date | 03-07-2026 |
| ISBN | |
| Series | ILLC Dissertation Series, DS-2026-09 |
| Number of pages | 168 |
| Organisations | |
| Abstract | Metaphor and humor are indispensable parts of human cognition and communication, yet they can pose challenges to large language models (LLMs). This thesis proposes resources and evaluation frameworks for LLMs’ capabilities in processing metaphor and humor. (i) Paraphrasing of linguistic metaphors: I manually construct ~1,500 test sets involving inapt paraphrases and evaluate LLMs on two paraphrasing tasks. The apt and inapt paraphrases correspond to the target and source domains of the metaphors, respectively. (ii) Metaphor intentions: I evaluate LLMs’ capabilities to infer the intentions behind metaphor use via a dataset of ~1,000 metaphor samples I co-annotated. The annotation is based on a novel taxonomy of nine intention categories I co-developed. (iii) Humorous multimodal metaphor use: I develop a novel annotation scheme and annotate humorous multimodal metaphor use in 1,000 New Yorker cartoons. Using the dataset, I test multimodal LLMs on a suite of six tasks. Our experiments reveal challenges LLMs face in various metaphor and humor processing scenarios and suggest solutions to the problems. (iv) Cultural differences in humor appreciation: I establish human baselines by collecting 25,600 funniness ratings and emotional reaction annotations for 800 New Yorker cartoons from four cultures (Chinese, Mexican, Polish, and the U.S.). My data analysis reveals general patterns and intricacies of what is considered humorous in different cultures, how humor appreciation is associated with emotional reactions, and how metaphor may affect humor appreciation depending on the culture. |
| Document type | PhD thesis |
| Language | English |
