Garcia, M., Vieira, T.K., Scarton, C. et al. (2 more authors) (2021) Probing for idiomaticity in vector space models. In: Merlo, P., Tiedemann, J. and Tsarfaty, R., (eds.) Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics. 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021), 19-23 Apr 2021, Virtual conference. Association for Computational Linguistics (ACL), pp. 3551-3564.
Abstract
Contextualised word representation models have been successfully used for capturing different word usages, and they may be an attractive alternative for representing idiomaticity in language. In this paper, we propose probing measures to assess whether some of the expected linguistic properties of noun compounds, especially those related to idiomatic meanings, and their dependence on context and sensitivity to lexical choice, are readily available in some standard and widely used representations. For that, we constructed the Noun Compound Senses Dataset, which contains noun compounds and their paraphrases, in context-neutral and context-informative naturalistic sentences, in two languages: English and Portuguese. Results obtained using four types of probing measures with models like ELMo, BERT and some of its variants indicate that idiomaticity is not yet accurately represented by contextualised models.
Metadata
Item Type: | Proceedings Paper |
---|---|
Copyright, Publisher and Additional Information: | © 2021 The Authors. Made available under a Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/). |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Funding Information: | Engineering and Physical Science Research Council (grant number EP/T02450X/1); The Royal Society (grant number NAF\R2\202209) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 12 Feb 2021 09:31 |
Last Modified: | 19 Dec 2022 13:50 |
Published Version: | https://www.aclweb.org/anthology/2021.eacl-main.31... |
Status: | Published |
Publisher: | Association for Computational Linguistics (ACL) |
Refereed: | Yes |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:170754 |