Sivakumar, J.A. and Moosavi, N.S. orcid.org/0000-0002-8332-307X (2025) How to leverage digit embeddings to represent numbers? In: Rambow, O., Wanner, L., Apidianaki, M., Al-Khalifa, H., Di Eugenio, B. and Schockaert, S. (eds.) Proceedings of the 31st International Conference on Computational Linguistics. The 31st International Conference on Computational Linguistics (COLING 2025), 19-24 Jan 2025, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics (ACL), pp. 7685-7697. ISBN 9798891761964
Abstract
Within numerical reasoning, understanding numbers themselves is still a challenge for existing language models. Simple generalisations, such as solving 100+200 instead of 1+2, can substantially affect model performance (Sivakumar and Moosavi, 2023). Among various techniques, character-level embeddings of numbers have emerged as a promising approach to improve number representation. However, this method has limitations as it leaves the task of aggregating digit representations to the model, which lacks direct supervision for this process. In this paper, we explore the use of mathematical priors to compute aggregated digit embeddings and explicitly incorporate these aggregates into transformer models. This can be achieved either by adding a special token to the input embeddings or by introducing an additional loss function to enhance correct predictions. We evaluate the effectiveness of incorporating this explicit aggregation, analysing its strengths and shortcomings, and discuss future directions to better benefit from this approach. Our methods, while simple, are compatible with any pretrained model, easy to implement, and have been made publicly available.
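To make the idea of an explicitly aggregated digit embedding concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function name `aggregate_digit_embeddings`, the toy embedding table, and the linear position weighting (more weight on more significant digits) are all assumptions chosen for illustration, since the abstract does not specify which mathematical prior is used.

```python
from typing import Dict, List


def aggregate_digit_embeddings(
    number: str, digit_emb: Dict[str, List[float]]
) -> List[float]:
    """Combine per-digit embeddings into one vector by a weighted sum.

    ASSUMPTION: weights decrease linearly from the most significant digit
    to the least significant one and are normalised to sum to 1. This
    weighting is hypothetical and only stands in for a "mathematical prior".
    """
    digits = [ch for ch in number if ch.isdigit()]
    if not digits:
        raise ValueError("number must contain at least one digit")

    n = len(digits)
    raw = [n - i for i in range(n)]  # leftmost digit gets n, rightmost gets 1
    total = sum(raw)
    weights = [w / total for w in raw]

    dim = len(digit_emb[digits[0]])
    aggregate = [0.0] * dim
    for weight, digit in zip(weights, digits):
        for k, value in enumerate(digit_emb[digit]):
            aggregate[k] += weight * value
    return aggregate


# Toy 2-dimensional embedding table (hypothetical values, not learned).
toy_emb = {str(d): [float(d), 1.0] for d in range(10)}

agg_12 = aggregate_digit_embeddings("12", toy_emb)
agg_21 = aggregate_digit_embeddings("21", toy_emb)
```

Because the weights depend on digit position, `"12"` and `"21"` map to different vectors even though they contain the same digits. Under the abstract's description, a vector like this could either be supplied as an extra special token in the input embeddings or used as a target in an auxiliary loss; how that is done in practice is left to the paper.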
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Sivakumar, J.A.; Moosavi, N.S. (ORCID: 0000-0002-8332-307X) |
Editors: | Rambow, O.; Wanner, L.; Apidianaki, M.; Al-Khalifa, H.; Di Eugenio, B.; Schockaert, S. |
Copyright, Publisher and Additional Information: | © 2025 The Association for Computational Linguistics. This paper is made available under a Creative Commons Attribution 4.0 International License. (https://creativecommons.org/licenses/by/4.0/) |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 21 May 2025 15:20 |
Last Modified: | 21 May 2025 15:20 |
Published Version: | https://aclanthology.org/2025.coling-main.514/ |
Status: | Published |
Publisher: | Association for Computational Linguistics (ACL) |
Refereed: | Yes |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:226970 |