Aker, A., Paramita, M.L., Barker, E. et al. (1 more author) (2014) Bootstrapping Term Extractors for Multiple Languages. In: Proceedings of the 9th LREC Conference. LREC 2014, Ninth International Conference on Language Resources and Evaluation, 26-31 May 2014, Reykjavik, Iceland, pp. 483-489.
Abstract
Terminology extraction resources are needed for a wide range of human language technology applications, including knowledge management, information extraction, semantic search, cross-language information retrieval, and automatic and assisted translation. We report a low-cost method for creating terminology extraction resources for 21 non-English EU languages. Using parallel corpora and a projection method, we create a General POS Tagger for these languages. We also investigate the use of EuroVoc terms and Wikipedia to automatically create a term grammar for each language. Our results show that these automatically generated resources can assist the term extraction process, achieving similar performance to manually generated resources. All POS tagger and term grammar resources resulting from this work are freely available for download.
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: |
Copyright, Publisher and Additional Information: | The LREC 2014 Proceedings are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/) |
Keywords: | POS Tagger; term grammar; EU languages |
Dates: |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Funding Information: | Funder: EUROPEAN COMMISSION - FP6/FP7; Grant number: TAAS - 296312 |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 07 Mar 2016 10:58 |
Last Modified: | 07 Mar 2016 10:58 |
Published Version: | http://www.lrec-conf.org/proceedings/lrec2014/pdf/... |
Status: | Published |
Refereed: | Yes |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:94337 |