Ezeani, I. orcid.org/0000-0001-8286-9997, Hepple, M. orcid.org/0000-0003-1488-257X, Onyenwe, I. et al. (1 more author) (2018) Multi-task projected embedding for Igbo. In: Sojka, P., Horák, A., Kopeček, I. and Pala, K., (eds.) Text, Speech, and Dialogue: 21st International Conference, Proceedings. 21st International Conference on Text, Speech, and Dialogue, 11-14 Sep 2018, Brno, Czech Republic. Springer, pp. 285-294. ISBN 9783030007935
Abstract
NLP research on low-resource African languages is often impeded by the unavailability of basic resources: tools, techniques, annotated corpora, and datasets. Besides the lack of funding for the manual development of these resources, building them from scratch would amount to reinventing the wheel. Therefore, adapting existing techniques and models from well-resourced languages is often an attractive option. Word embeddings are among the most widely applied NLP models, but training them often requires large amounts of data, which are not available for most African languages. In this work, we adopt an alignment-based projection method to transfer trained English embeddings to the Igbo language. Various English embedding models were projected and evaluated intrinsically on the odd-word, analogy, and word-similarity tasks, and also on the diacritic restoration task. Our results show that the projected embeddings performed very well across these tasks.
Metadata
Item Type: | Proceedings Paper |
---|---|
Copyright, Publisher and Additional Information: | © 2018 Springer Nature. This is an author-produced version of a paper subsequently published in Text, Speech, and Dialogue. Uploaded in accordance with the publisher's self-archiving policy. |
Keywords: | Low-resource; Igbo; Diacritics; Embedding models; Transfer learning |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 24 Sep 2019 08:08 |
Last Modified: | 03 Oct 2019 14:20 |
Status: | Published |
Publisher: | Springer |
Refereed: | Yes |
Identification Number: | 10.1007/978-3-030-00794-2_31 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:151187 |