Yamaguchi, A. (orcid.org/0000-0001-8327-7598), Villavicencio, A. and Aletras, N. (2024) An empirical study on cross-lingual vocabulary adaptation for efficient language model inference. In: Al-Onaizan, Y., Bansal, M. and Chen, Y.-N. (eds.) Findings of the Association for Computational Linguistics: EMNLP 2024. The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), 12-16 Nov 2024, Miami, Florida, USA. Association for Computational Linguistics, pp. 6760-6785. ISBN 9798891761681
Abstract
The development of state-of-the-art generative large language models (LLMs) disproportionately relies on English-centric tokenizers, vocabularies and pre-training data. Although some LLMs have multilingual capabilities, recent studies have shown that their inference efficiency deteriorates when generating text in languages other than English, resulting in increased inference time and costs. Cross-lingual vocabulary adaptation (CVA) methods have been proposed for adapting models to a target language with the aim of improving downstream performance. However, the effectiveness of these methods in increasing the inference efficiency of generative LLMs has yet to be explored. In this paper, we perform an empirical study of five CVA methods on four generative LLMs (including monolingual and multilingual models) across four typologically diverse languages and four natural language understanding tasks. We find that CVA substantially contributes to LLM inference speedups of up to 271.5%. We also show that adapting LLMs that have been pre-trained on more balanced multilingual data results in downstream performance comparable to the original models.
Metadata
| Item Type: | Proceedings Paper |
|---|---|
| Authors/Creators: | Yamaguchi, A.; Villavicencio, A.; Aletras, N. |
| Editors: | Al-Onaizan, Y.; Bansal, M.; Chen, Y.-N. |
| Copyright, Publisher and Additional Information: | © 2024 Association for Computational Linguistics (ACL). Licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/). |
| Dates: | |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Depositing User: | Symplectic Sheffield |
| Date Deposited: | 23 Oct 2024 15:07 |
| Last Modified: | 13 Nov 2024 14:32 |
| Published Version: | https://aclanthology.org/2024.findings-emnlp.396 |
| Status: | Published |
| Publisher: | Association for Computational Linguistics |
| Refereed: | Yes |
| Related URLs: | |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:218822 |
Available Versions of this Item
- An empirical study on cross-lingual vocabulary adaptation for efficient language model inference. (deposited 23 Oct 2024 14:46)
- An empirical study on cross-lingual vocabulary adaptation for efficient language model inference. (deposited 23 Oct 2024 15:07) [Currently Displayed]