Cordeiro, S.R., Ramisch, C. and Villavicencio, A. orcid.org/0000-0002-3731-9168 (2016) UFRGS&LIF at SemEval-2016 task 10: Rule-based MWE identification and predominant-supersense tagging. In: Bethard, S., Carpuat, M., Cer, D., Jurgens, D., Nakov, P. and Zesch, T., (eds.) Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016). 10th International Workshop on Semantic Evaluation (SemEval-2016), 16-17 Jun 2016, San Diego, California. Association for Computational Linguistics, pp. 910-917. ISBN 9781941643952
Abstract
This paper presents our approach to SemEval-2016 Task 10: Detecting Minimal Semantic Units and their Meanings. Systems are expected to provide a representation of lexical semantics by (1) segmenting tokens into words and multiword units and (2) providing a supersense tag for segments that function as nouns or verbs. Our rule-based pipeline system uses no external resources and was implemented using the mwetoolkit. First, we extract and filter known multiword expressions (MWEs) from the training corpus. Second, we group input tokens of the test corpus based on this lexicon, with special treatment for non-contiguous expressions. Third, we use an MWE-aware predominant-sense heuristic for supersense tagging. We obtain an F-score of 51.48% for MWE identification and 49.98% for supersense tagging.
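The three steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the mwetoolkit API; the function names, the dictionary-based sentence format, and the supersense labels below are all hypothetical, chosen only for illustration. The sketch also handles contiguous expressions only, whereas the paper mentions special treatment for non-contiguous ones.

```python
from collections import Counter, defaultdict

def build_mwe_lexicon(train_sents, min_count=1):
    """Step 1 (sketch): collect MWEs annotated in training data, filtered by frequency."""
    counts = Counter()
    for sent in train_sents:
        for indices in sent["mwes"]:  # hypothetical format: list of token indices per MWE
            counts[tuple(sent["tokens"][i].lower() for i in indices)] += 1
    return {mwe for mwe, c in counts.items() if c >= min_count}

def segment(tokens, lexicon, max_len=5):
    """Step 2 (sketch): greedy longest-match grouping of contiguous tokens."""
    lowered = [t.lower() for t in tokens]
    segments, i = [], 0
    while i < len(lowered):
        for n in range(min(max_len, len(lowered) - i), 1, -1):
            candidate = tuple(lowered[i:i + n])
            if candidate in lexicon:
                segments.append(candidate)
                i += n
                break
        else:  # no multiword match starts here: emit a single-word segment
            segments.append((lowered[i],))
            i += 1
    return segments

def train_predominant_sense(tagged_segments):
    """Step 3 (sketch): map each segment to its most frequent training supersense."""
    table = defaultdict(Counter)
    for seg, sense in tagged_segments:
        if sense is not None:
            table[seg][sense] += 1
    return {seg: c.most_common(1)[0][0] for seg, c in table.items()}

def tag(tokens, lexicon, predominant):
    """Segment a sentence, then look up the predominant supersense of each segment."""
    return [(seg, predominant.get(seg)) for seg in segment(tokens, lexicon)]

# Toy data; the labels are illustrative, not taken from the task corpus.
train_sents = [{"tokens": ["He", "gave", "up", "smoking"], "mwes": [[1, 2]]}]
lexicon = build_mwe_lexicon(train_sents)
predominant = train_predominant_sense([
    (("gave", "up"), "v.change"),
    (("gave", "up"), "v.change"),
    (("gave", "up"), "v.possession"),
    (("dog",), "n.animal"),
])
result = tag(["She", "gave", "up", "the", "dog"], lexicon, predominant)
```

Because the supersense is looked up per segment rather than per token, "gave up" receives the sense most often seen for the whole expression, which is one plausible reading of what "MWE-aware" means here.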
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Cordeiro, S.R.; Ramisch, C.; Villavicencio, A. (orcid.org/0000-0002-3731-9168) |
Editors: | Bethard, S.; Carpuat, M.; Cer, D.; Jurgens, D.; Nakov, P.; Zesch, T. |
Copyright, Publisher and Additional Information: | © 2016 Association for Computational Linguistics. |
Dates: | Published: 2016 |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 21 Nov 2019 15:33 |
Last Modified: | 21 Nov 2019 16:33 |
Status: | Published |
Publisher: | Association for Computational Linguistics |
Refereed: | Yes |
Identification Number: | 10.18653/v1/S16-1140 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:153561 |