Revisiting the linearity in cross-lingual embedding mappings: from a perspective of word analogies

Abstract

Most cross-lingual embedding mapping algorithms assume the optimised transformation functions to be linear. Recent studies showed that on some occasions, learning a linear mapping does not work, indicating that the commonly-used assumption may fail. However, it remains unclear under which conditions the linearity of cross-lingual embedding mappings holds. In this paper, we rigorously explain that the linearity assumption relies on the consistency of analogical relations encoded by multilingual embeddings. We conducted extensive experiments to validate this claim. Empirical results based on the analogy completion benchmark and the BLI task demonstrate a strong correlation between whether mappings capture analogical information and whether they are linear.

Metadata

Item Type:	Article
Authors/Creators:	Peng, X. https://orcid.org/0000-0001-5787-9982; Lin, C. https://orcid.org/0000-0003-3454-2468; Stevenson, M. https://orcid.org/0000-0002-9483-6006; Li, C.
Copyright, Publisher and Additional Information:	© 2020 The Author(s). For reuse permissions, please contact the Author(s).
Dates:	Submitted: 2 April 2020
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield)
Depositing User:	Symplectic Sheffield
Date Deposited:	12 Aug 2021 13:43
Last Modified:	12 Aug 2021 16:31
Published Version:	https://arxiv.org/abs/2004.01079v1
Status:	Submitted
Related URLs:	arXiv URL
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:177033
