When Models Reason in Your Language: Controlling Thinking Language Comes at the Cost of Accuracy
| Authors | |
|---|---|
| Publication date | 2025 |
| Host editors | |
| Book title | The 2025 Conference on Empirical Methods in Natural Language Processing : Findings of EMNLP 2025 |
| Book subtitle | EMNLP 2025 : November 4-9, 2025 |
| ISBN (electronic) | |
| Event | 30th Conference on Empirical Methods in Natural Language Processing, EMNLP 2025 |
| Pages (from-to) | 20279–20296 |
| Publisher | Kerrville, TX: Association for Computational Linguistics |
| Organisations | |
| Abstract | Recent Large Reasoning Models (LRMs) with thinking traces have shown strong performance on English reasoning tasks. However, the extent to which LRMs can think in other languages is less studied. This is as important as answer accuracy for real-world applications since users may find the thinking trace useful for oversight only if expressed in their languages. In this work, we comprehensively evaluate two leading families of LRMs on our established benchmark XReasoning. Surprisingly, even the most advanced models often revert to English or produce fragmented reasoning in other languages, revealing a substantial gap in the capability of thinking in non-English languages. Promoting models to reason in the user’s language via prompt hacking enhances readability and oversight. This could gain user trust, but reduces answer accuracy, exposing an important trade-off. We further demonstrate that targeted post-training, even with just 100 instances, can mitigate this language mismatch, although accuracy is still degraded. Our results reveal the limited multilingual reasoning capabilities of current LRMs and suggest directions for future research. All code and datasets are released at https://github.com/Betswish/mCoT-XReasoning. |
| Document type | Conference contribution |
| Note | With checklist |
| Language | English |
| Published at | https://aclanthology.org/2025.findings-emnlp.1103/ |
| Other links | https://github.com/Betswish/mCoT-XReasoning |
| Downloads | 2025.findings-emnlp.1103 (Final published version) |