Pickard, T (2020) Comparing word2vec and GloVe for Automatic Measurement of MWE Compositionality. In: Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons. COLING 2020, The 28th International Conference on Computational Linguistics, 08-13 Dec 2020, Online.
Abstract
This paper explores the use of word2vec and GloVe embeddings for unsupervised measurement of the semantic compositionality of MWE candidates. Through comparison with several human-annotated reference sets, we find word2vec to be substantively superior to GloVe for this task. We also find Simple English Wikipedia to be a poor-quality resource for compositionality assessment, but demonstrate that a sample of 10% of sentences in the English Wikipedia can provide a conveniently tractable corpus with only moderate reduction in the quality of outputs.
Metadata
| Item Type: | Proceedings Paper |
|---|---|
| Authors/Creators: | Pickard, T |
| Copyright, Publisher and Additional Information: | This item is protected by copyright. This work is licensed under a Creative Commons Attribution 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by/4.0/. |
| Dates: | |
| Institution: | The University of Leeds |
| Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
| Depositing User: | Symplectic Publications |
| Date Deposited: | 05 Mar 2021 13:41 |
| Last Modified: | 05 Mar 2021 13:41 |
| Status: | Published |
| Related URLs: | |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:171824 |