Induction Heads as an Essential Mechanism for Pattern Matching in In-context Learning

Open Access
Authors
Publication date 2025
Host editors
  • Luis Chiruzzo
  • Alan Ritter
  • Lu Wang
Book title Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics : Proceedings of the Conference : Findings
Book subtitle NAACL 2025 : April 29–May 4, 2025
ISBN (electronic)
  • 9798891761957
Event 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics, NAACL 2025
Pages (from-to) 5049–5111
Publisher Kerrville, TX: Association for Computational Linguistics
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract

Large language models (LLMs) have shown a remarkable ability to learn and perform complex tasks through in-context learning (ICL). However, a comprehensive understanding of the internal mechanisms behind ICL is still lacking. This paper explores the role of induction heads in a few-shot ICL setting. We analyse two state-of-the-art models, Llama-3-8B and InternLM2-20B, on abstract pattern recognition and NLP tasks. Our results show that even a minimal ablation of induction heads leads to ICL performance decreases of up to ~32% for abstract pattern recognition tasks, bringing the performance close to random. For NLP tasks, this ablation substantially decreases the model’s ability to benefit from examples, bringing few-shot ICL performance close to that of zero-shot prompts. We further use attention knockout to disable specific induction patterns, and present fine-grained evidence for the role that the induction mechanism plays in ICL.
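As a rough, illustrative sketch of what an induction-head ablation of this kind might look like in practice (the authors' actual implementation is in the GitHub repository linked below), the following Python snippet zeroes the output of a chosen set of attention heads in a HuggingFace Llama-style model via forward pre-hooks on each layer's attention output projection. The model id, the (layer, head) indices, and the probe prompt are placeholder assumptions, not values taken from the paper.

```python
# Illustrative sketch only -- not the paper's implementation (see the
# GitHub link below for that). Ablates selected attention heads by
# zeroing their slice of the concatenated head outputs before the
# attention output projection (o_proj) in a Llama-style model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Meta-Llama-3-8B"          # assumption: HF model id
HEADS_TO_ABLATE = {(5, 0), (5, 7), (11, 3)}   # hypothetical (layer, head) pairs

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.bfloat16)
model.eval()

def make_pre_hook(layer_idx):
    def pre_hook(module, args):
        # args[0]: concatenated head outputs, shape (batch, seq, hidden_size)
        hidden = args[0].clone()
        n_heads = model.config.num_attention_heads
        head_dim = hidden.shape[-1] // n_heads
        for layer, head in HEADS_TO_ABLATE:
            if layer == layer_idx:
                hidden[..., head * head_dim:(head + 1) * head_dim] = 0.0
        return (hidden,)  # returned tuple replaces o_proj's inputs
    return pre_hook

handles = [
    layer.self_attn.o_proj.register_forward_pre_hook(make_pre_hook(i))
    for i, layer in enumerate(model.model.layers)
]

prompt = "abc abc ab"  # toy repeated pattern, a typical induction probe
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)
print(tok.decode(out.logits[0, -1].argmax()))  # prediction with heads ablated

for h in handles:
    h.remove()  # restore the unablated model
```

Performance under such an ablation would then be compared with the unablated model on the same few-shot prompts. The attention-knockout analysis the abstract mentions is finer-grained, blocking attention between specific token positions rather than removing whole heads.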

Document type Conference contribution
Language English
Published at https://doi.org/10.18653/v1/2025.findings-naacl.283
Other links https://github.com/JoyC177/Induction_Heads_ICL
Downloads
2025.findings-naacl.283 (Final published version)