
A more recent version of this eprint is available.
Yamaguchi, A. orcid.org/0000-0001-8327-7598, Morishita, T., Villavicencio, A. orcid.org/0000-0002-3731-9168 et al. (1 more author) (2025) Mitigating catastrophic forgetting in target language adaptation of LLMs via Source-Shielded Updates. [Preprint - arXiv] (Submitted)
Abstract
Expanding the linguistic diversity of instruct large language models (LLMs) is crucial for global accessibility but is often hindered by the reliance on costly specialized target language labeled data and catastrophic forgetting during adaptation. We tackle this challenge under a realistic, low-resource constraint: adapting instruct LLMs using only unlabeled target language data. We introduce Source-Shielded Updates (SSU), a selective parameter update strategy that proactively preserves source knowledge. Using a small set of source data and a parameter importance scoring method, SSU identifies parameters critical to maintaining source abilities. It then applies a column-wise freezing strategy to protect these parameters before adaptation. Experiments across five typologically diverse languages and 7B and 13B models demonstrate that SSU successfully mitigates catastrophic forgetting. It reduces performance degradation on monolingual source tasks to just 3.4% (7B) and 2.8% (13B) on average, a stark contrast to the 20.3% and 22.3% from full fine-tuning. SSU also achieves target-language performance highly competitive with full fine-tuning, outperforming it on all benchmarks for 7B models and the majority for 13B models.
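The column-wise freezing idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The importance values, the sum-based column aggregation, the `freeze_ratio` parameter, the plain SGD step, and all function names are assumptions made for illustration; the abstract does not specify which scoring method or freezing budget SSU uses.

```python
from typing import List

Matrix = List[List[float]]


def column_scores(importance: Matrix) -> List[float]:
    """Aggregate per-parameter importance into one score per column.

    Summation is an assumed aggregation, chosen only for illustration.
    """
    n_cols = len(importance[0])
    return [sum(row[j] for row in importance) for j in range(n_cols)]


def frozen_columns(importance: Matrix, freeze_ratio: float) -> List[bool]:
    """Mark the highest-scoring fraction of columns as frozen."""
    scores = column_scores(importance)
    n_freeze = int(freeze_ratio * len(scores))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    frozen = [False] * len(scores)
    for j in ranked[:n_freeze]:
        frozen[j] = True
    return frozen


def masked_sgd_step(weights: Matrix, grads: Matrix,
                    frozen: List[bool], lr: float) -> Matrix:
    """Apply one SGD step, leaving every frozen column untouched."""
    return [
        [w if frozen[j] else w - lr * g
         for j, (w, g) in enumerate(zip(w_row, g_row))]
        for w_row, g_row in zip(weights, grads)
    ]


# Hypothetical importance scores, as if computed on a small set of source data.
importance = [[0.9, 0.1, 0.2, 0.7],
              [0.8, 0.0, 0.3, 0.6]]
frozen = frozen_columns(importance, freeze_ratio=0.5)

# Hypothetical weights and target-language gradients.
weights = [[1.0, 1.0, 1.0, 1.0],
           [2.0, 2.0, 2.0, 2.0]]
grads = [[1.0, 1.0, 1.0, 1.0],
         [1.0, 1.0, 1.0, 1.0]]
updated = masked_sgd_step(weights, grads, frozen, lr=0.5)
```

In this toy case, columns 0 and 3 carry the highest aggregate importance, so they keep their original values while columns 1 and 2 are updated. The freezing mask is computed once, before adaptation, which matches the "protect these parameters before adaptation" wording of the abstract.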
Metadata
| Item Type: | Preprint |
|---|---|
| Authors/Creators: | Yamaguchi, A.; Morishita, T.; Villavicencio, A.; et al. (1 more author) |
| Copyright, Publisher and Additional Information: | © 2025 The Author(s). For reuse permissions, please contact the Author(s). |
| Keywords: | Information and Computing Sciences; Language, Communication and Culture; Linguistics |
| Dates: | |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Funding Information: | Engineering and Physical Sciences Research Council (grant number 2894795) |
| Date Deposited: | 06 May 2026 13:43 |
| Last Modified: | 06 May 2026 13:43 |
| Status: | Submitted |
| Identification Number: | 10.48550/arxiv.2512.04844 |
| Related URLs: | |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:240754 |
Available Versions of this Item
- Mitigating catastrophic forgetting in target language adaptation of LLMs via Source-Shielded Updates. (deposited 06 May 2026 13:43) [Currently Displayed]
Download
Filename: 2512.04844v1.pdf

CORE (COnnecting REpositories)