Newman-Griffis, D. (ORCID: 0000-0002-0473-4226), Lai, A. and Fosler-Lussier, E.
(2017)
Insights into analogy completion from the biomedical domain.
In: Bretonnel Cohen, K., Demner-Fushman, D., Ananiadou, S. and Tsujii, J., (eds.)
Proceedings of the 16th BioNLP Workshop (BioNLP 2017).
The 16th Biomedical Natural Language Processing Workshop (BioNLP 2017), 04 Aug 2017, Vancouver, Canada.
Association for Computational Linguistics, pp. 19-28.
ISBN 9781945626593
Abstract
Analogy completion has been a popular task in recent years for evaluating the semantic properties of word embeddings, but the standard methodology makes a number of assumptions about analogies that do not always hold, either in recent benchmark datasets or when expanding into other domains. Through an analysis of analogies in the biomedical domain, we identify three assumptions: that of a Single Answer for any given analogy, that the pairs involved describe the Same Relationship, and that each pair is Informative with respect to the other. We propose modifying the standard methodology to relax these assumptions by allowing for multiple correct answers, reporting MAP and MRR in addition to accuracy, and using multiple example pairs. We further present BMASS, a novel dataset for evaluating linguistic regularities in biomedical embeddings, and demonstrate that the relationships described in the dataset pose significant semantic challenges to current word embedding methods.
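The abstract's proposal to allow multiple correct answers and to report MAP and MRR alongside accuracy can be illustrated with a minimal sketch. It uses the standard information-retrieval definitions of average precision and reciprocal rank, which may differ in detail from the formulation in the paper. The function names and the toy candidate lists are assumptions made for illustration and are not taken from BMASS.

```python
from typing import Dict, List, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[str], correct: Set[str]) -> float:
    """Return 1/rank of the first correct answer, or 0.0 if none is retrieved."""
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in correct:
            return 1.0 / rank
    return 0.0


def average_precision(ranked: Sequence[str], correct: Set[str]) -> float:
    """Average of precision@k over the ranks k where a correct answer appears.

    Assumption: the sum is normalised by the total number of correct answers,
    so correct answers missing from the ranking count against the score.
    """
    if not correct:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in correct:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(correct)


def evaluate(queries: List[Tuple[Sequence[str], Set[str]]]) -> Dict[str, float]:
    """Compute accuracy, MAP, and MRR over (ranked candidates, correct set) pairs."""
    n = len(queries)
    if n == 0:
        raise ValueError("at least one query is required")
    # Accuracy here means: the top-ranked candidate is any one of the correct answers.
    accuracy = sum(1.0 for ranked, correct in queries if ranked and ranked[0] in correct) / n
    mean_ap = sum(average_precision(ranked, correct) for ranked, correct in queries) / n
    mean_rr = sum(reciprocal_rank(ranked, correct) for ranked, correct in queries) / n
    return {"accuracy": accuracy, "map": mean_ap, "mrr": mean_rr}


if __name__ == "__main__":
    # Hypothetical analogy queries: each has a ranked candidate list and a set
    # of acceptable answers (more than one answer may be correct).
    toy_queries = [
        (["a", "b", "c", "d"], {"b", "d"}),
        (["x", "y", "z"], {"x"}),
    ]
    print(evaluate(toy_queries))
```

In this toy setting the first query scores zero on accuracy even though both of its correct answers are retrieved, which is the kind of case where rank-based metrics give a fuller picture than accuracy alone.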
Metadata
Item Type: | Proceedings Paper |
---|---|
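Authors/Creators: | Newman-Griffis, D., Lai, A. and Fosler-Lussier, E. |
Editors: | Bretonnel Cohen, K., Demner-Fushman, D., Ananiadou, S. and Tsujii, J. |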
Copyright, Publisher and Additional Information: | © 2017 Association for Computational Linguistics. Licensed on a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/). |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 17 Feb 2023 11:49 |
Last Modified: | 18 Feb 2023 01:16 |
Status: | Published |
Publisher: | Association for Computational Linguistics |
Refereed: | Yes |
Identification Number: | 10.18653/v1/w17-2303 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:196494 |
Download
Filename: 2017_bionlp.pdf
Licence: CC-BY 4.0