Yang, X., Mu, Y., Bontcheva, K. orcid.org/0000-0001-6152-9600 et al. (1 more author) (2024) Optimising LLM-driven machine translation with context-aware sliding windows. In: Haddow, B., Kocmi, T., Koehn, P. and Monz, C., (eds.) Proceedings of the Ninth Conference on Machine Translation. Ninth Conference on Machine Translation (WMT24), 15-16 Nov 2024, Miami, Florida, USA. Association for Computational Linguistics, pp. 1004-1010. ISBN 979-8-89176-179-7
Abstract
This paper describes SheffieldGATE’s submission to the WMT 2024 Chat Shared Translation Task. We participate in three language pairs: English-German, English-Dutch, and English-Portuguese (Brazil). In this work, we introduce a context-aware sliding window decoding method to track dependencies between chat messages. We fine-tune a large pre-trained language model on the training data provided by the shared task. Our experiments (i) compare model performance between multilingual and bilingual fine-tuning and (ii) assess the impact of different window sizes. Our experimental results demonstrate that utilising contextual information yields superior performance in document-level translation compared to translating documents as isolated text segments, and that models fine-tuned with multilingual data perform better than those fine-tuned with bilingual data.
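The abstract does not spell out how the sliding window is built, so the following is only an illustrative sketch of the general idea: each chat message is paired with a fixed number of preceding messages that serve as context. The function names `sliding_windows` and `build_prompt`, the prompt wording, and the choice to use only preceding source-side messages are all assumptions made for this example, not details taken from the paper.

```python
from typing import List, Tuple


def sliding_windows(messages: List[str], window_size: int) -> List[Tuple[List[str], str]]:
    """Pair each message with up to `window_size` preceding messages.

    Hypothetical helper: the paper's actual windowing scheme may differ.
    """
    if window_size < 0:
        raise ValueError("window_size must be non-negative")
    pairs = []
    for i, current in enumerate(messages):
        start = max(0, i - window_size)
        # messages[start:i] is the context; it is empty for the first message
        # or when window_size is 0 (isolated-segment translation).
        pairs.append((messages[start:i], current))
    return pairs


def build_prompt(context: List[str], current: str, src_lang: str, tgt_lang: str) -> str:
    """Format one translation prompt (assumed template, for illustration only)."""
    lines = [f"Translate the last message from {src_lang} to {tgt_lang}."]
    if context:
        lines.append("Context:")
        lines.extend(f"- {message}" for message in context)
    lines.append(f"Message: {current}")
    return "\n".join(lines)


if __name__ == "__main__":
    chat = ["Hi, my order is late.", "Which order is it?", "The one from Monday."]
    for context, current in sliding_windows(chat, window_size=2):
        print(build_prompt(context, current, "English", "German"))
        print("---")
```

Setting `window_size` to 0 in this sketch reduces to translating each message in isolation, which is the baseline the abstract compares against.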
Metadata
| Item Type: | Proceedings Paper |
|---|---|
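| Authors/Creators: | Yang, X.; Mu, Y.; Bontcheva, K. (orcid.org/0000-0001-6152-9600); et al. (1 more author) |
| Editors: | Haddow, B.; Kocmi, T.; Koehn, P.; Monz, C. |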
| Copyright, Publisher and Additional Information: | © 2024 Association for Computational Linguistics. This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
| Dates: | |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Depositing User: | Symplectic Sheffield |
| Date Deposited: | 13 Feb 2025 14:03 |
| Last Modified: | 13 Feb 2025 15:12 |
| Status: | Published |
| Publisher: | Association for Computational Linguistics |
| Refereed: | Yes |
| Identification Number: | 10.18653/v1/2024.wmt-1.101 |
| Related URLs: | |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:223235 |
