Mi, M., Villavicencio, A. orcid.org/0000-0002-3731-9168 and Moosavi, N.S. orcid.org/0000-0002-8332-307X (2025) Rolling the DICE on idiomaticity: how LLMs fail to grasp context. In: Che, W., Nabende, J., Shutova, E. and Pilehvar, M.T., (eds.) Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 63rd Annual Meeting of the Association for Computational Linguistics, 27 Jul - 01 Aug 2025, Vienna, Austria. Association for Computational Linguistics, pp. 7314-7332. ISSN: 0736-587X.
Abstract
Human processing of idioms heavily depends on interpreting the surrounding context in which they appear. While large language models (LLMs) have achieved impressive performance on idiomaticity detection benchmarks, this success may be driven by reasoning shortcuts present in existing datasets. To address this, we introduce a novel, controlled contrastive dataset (DICE) specifically designed to assess whether LLMs can effectively leverage context to disambiguate idiomatic meanings. Furthermore, we investigate the influence of collocational frequency and sentence probability (proxies for human processing known to affect idiom resolution) on model performance. Our results show that LLMs frequently fail to resolve idiomaticity when it depends on contextual understanding, and they perform better on sentences deemed more likely by the model. Additionally, idiom frequency influences performance but does not guarantee accurate interpretation. Our findings emphasize the limitations of current models in grasping contextual meaning and highlight the need for more context-sensitive evaluation.
Metadata
| Item Type: | Proceedings Paper |
|---|---|
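| Authors/Creators: | Mi, M.; Villavicencio, A. (orcid.org/0000-0002-3731-9168); Moosavi, N.S. (orcid.org/0000-0002-8332-307X) |
| Editors: | Che, W.; Nabende, J.; Shutova, E.; Pilehvar, M.T. |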
| Copyright, Publisher and Additional Information: | © 2025 The Authors. This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
| Dates: | |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Date Deposited: | 06 Feb 2026 12:33 |
| Last Modified: | 06 Feb 2026 12:33 |
| Status: | Published |
| Publisher: | Association for Computational Linguistics |
| Refereed: | Yes |
| Identification Number: | 10.18653/v1/2025.acl-long.362 |
| Related URLs: | |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:236942 |
Download
Filename: 2025.acl-long.362.pdf
Licence: CC-BY 4.0
