Predicting the compositionality of nominal compounds: Giving word embeddings a hard time

Cordeiro, S., Ramisch, C., Idiart, M. and Villavicencio, A. (2016) Predicting the compositionality of nominal compounds: Giving word embeddings a hard time. In: Erk, K. and Smith, N.A., (eds.) Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Long Papers). 54th Annual Meeting of the Association for Computational Linguistics, 07-12 Aug 2016, Berlin, Germany. Vol. 1. Association for Computational Linguistics, pp. 1986-1997. ISBN: 9781510827585.

Abstract

Distributional semantic models (DSMs) are often evaluated on artificial similarity datasets containing single words or fully compositional phrases. We present a large-scale multilingual evaluation of DSMs for predicting the degree of semantic compositionality of nominal compounds on 4 datasets for English and French. We build a total of 816 DSMs and perform 2,856 evaluations using word2vec, GloVe, and PPMI-based models. In addition to the DSMs, we compare the impact of different parameters, such as level of corpus preprocessing, context window size and number of dimensions. The results obtained have a high correlation with human judgments, being comparable to or outperforming the state of the art for some datasets (Spearman's ρ=.82 for the Reddy dataset).
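
The abstract reports agreement with human judgments as Spearman's ρ. As a rough illustration of that evaluation step only, the sketch below ranks a set of model scores and a set of human ratings and correlates the ranks. The function names and the toy numbers are assumptions made for this example; they are not the paper's data, scores, or code.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def spearman_rho(predicted, gold):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(predicted), average_ranks(gold))


# Toy values (hypothetical): one human rating and one model score per compound.
human_ratings = [4.8, 0.5, 3.9, 1.2, 2.5]
model_scores = [0.71, 0.10, 0.35, 0.22, 0.40]

rho = spearman_rho(model_scores, human_ratings)
print(round(rho, 3))  # prints 0.9
```

Because only the ranks matter, the model scores and the human ratings can be on different scales, which is one reason a rank correlation suits this kind of comparison.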

Metadata

Item Type:	Proceedings Paper
Authors/Creators:	Cordeiro, S.; Ramisch, C.; Idiart, M.; Villavicencio, A. (https://orcid.org/0000-0002-3731-9168)
Editors:	Erk, K.; Smith, N.A.
Copyright, Publisher and Additional Information:	© 2016 The Association for Computational Linguistics
Dates:	Published: August 2016
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield)
Date Deposited:	21 Nov 2019 15:23
Last Modified:	21 Nov 2019 16:30
Status:	Published
Publisher:	Association for Computational Linguistics
Refereed:	Yes
Identification Number:	10.18653/v1/P16-1187
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:153562
