This is the latest version of this eprint.
Yamaguchi, A. orcid.org/0000-0001-8327-7598, Morishita, T., Villavicencio, A. et al. (1 more author) (2025) Adapting chat language models using only target unlabeled language data. Transactions on Machine Learning Research, 2025 (09). ISSN: 2835-8856
Abstract
Vocabulary expansion (VE) is the de facto approach to language adaptation of large language models (LLMs) by adding new tokens and continuing pre-training on target data. While this is effective for base models trained on unlabeled data, it poses challenges for chat models trained to follow instructions through labeled conversation data. Directly adapting the latter with VE on target unlabeled data may result in forgetting chat abilities. While ideal, target chat data is often unavailable or costly to create for low-resource languages, and machine-translated alternatives are not always effective. To address this issue, previous work proposed using a base and chat model from the same family. This method first adapts the base LLM with VE on target unlabeled data and then converts it to a chat model by adding a chat vector (CV) derived from the weight difference between the source base and chat models. We propose ElChat, a new language adaptation method for chat LLMs that adapts a chat model directly on target unlabeled data, without a base model. It elicits chat abilities by injecting information from the source chat model. ElChat offers more robust and competitive target language and safety performance while achieving superior English, chat, and instruction-following abilities compared to CV.
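The chat vector (CV) baseline described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and does not implement ElChat. It assumes model weights are plain dictionaries mapping a parameter name to a flat list of floats; the names `chat_vector` and `apply_chat_vector` are hypothetical. How parameters resized by vocabulary expansion are handled is also an assumption here: they are simply left unchanged.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flat list of weights.
Weights = Dict[str, List[float]]


def chat_vector(base: Weights, chat: Weights) -> Weights:
    """Return the per-parameter difference (chat - base) for the source models."""
    return {
        name: [c - b for b, c in zip(base[name], chat[name])]
        for name in base
        if name in chat and len(base[name]) == len(chat[name])
    }


def apply_chat_vector(adapted_base: Weights, cv: Weights) -> Weights:
    """Add the chat vector to a base model adapted on target unlabeled data."""
    result: Weights = {}
    for name, weights in adapted_base.items():
        delta = cv.get(name)
        if delta is not None and len(delta) == len(weights):
            result[name] = [w + d for w, d in zip(weights, delta)]
        else:
            # Assumption: parameters whose size changed (for example,
            # embeddings resized by vocabulary expansion) are kept as-is.
            result[name] = list(weights)
    return result


source_base = {"layer": [1.0, 2.0], "embed": [0.0, 0.0]}
source_chat = {"layer": [1.5, 1.0], "embed": [0.5, 0.5]}
# The adapted base has a larger embedding after vocabulary expansion.
adapted_base = {"layer": [2.0, 2.0], "embed": [0.0, 0.0, 0.0]}

cv = chat_vector(source_base, source_chat)
target_chat = apply_chat_vector(adapted_base, cv)
```

In this toy setup the `layer` weights receive the chat vector, while the resized `embed` weights pass through unchanged.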
Metadata
| Item Type: | Article |
|---|---|
| Authors/Creators: | |
| Copyright, Publisher and Additional Information: | © 2025 The Authors. This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
| Dates: | |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Funding Information: | Funder: Engineering and Physical Sciences Research Council; Grant number: 2894795 |
| Date Deposited: | 04 Nov 2025 14:44 |
| Last Modified: | 04 Nov 2025 15:09 |
| Published Version: | https://openreview.net/forum?id=6IdoIKowfe |
| Status: | Published |
| Publisher: | Journal of Machine Learning Research Inc. |
| Refereed: | Yes |
| Related URLs: | |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:233968 |
Available Versions of this Item
- Adapting chat language models using only target unlabeled language data. (deposited 04 Nov 2025 14:34)
- Adapting chat language models using only target unlabeled language data. (deposited 04 Nov 2025 14:44) [Currently Displayed]

CORE (COnnecting REpositories)