Atwell, ES and Sawalha, M (2013) Comparing morphological tag-sets for Arabic and English. In: Hardie, A and Love, R, (eds.) Proceedings of the 7th International Corpus Linguistics Conference CL2013. 7th International Corpus Linguistics Conference CL2013, 22-26 Jul 2013, Lancaster, UK. UCREL, 261-264.
Abstract
Arabic morphological analysers and stemming algorithms have become a popular area of research. Many computational linguists have designed and developed algorithms to solve the problem of morphology and stemming. Each researcher has proposed their own gold standard, testing methodology and accuracy measurements for evaluating their algorithm, so the reported results cannot be compared directly. This paper accomplishes two tasks. First, we propose four fair and precise accuracy measurements and two 1000-word gold standards, one taken from the Holy Qur’an and one from the Corpus of Contemporary Arabic. Second, we run the morphological analysers and stemming algorithms on the sample documents and combine their results by voting. The evaluation of the algorithms shows that Arabic morphology is still a challenge.
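The voting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the voting scheme or the four accuracy measurements, so everything below is an assumption: plain majority voting per word (ties go to the earliest-listed analyser), a simple exact-match accuracy score, and hypothetical function names and toy transliterated data that do not come from the paper.

```python
from collections import Counter


def vote(candidates):
    """Return the most frequent candidate.

    Ties are broken in favour of the earliest-listed analyser
    (an assumption; the abstract does not describe tie-breaking).
    """
    counts = Counter(candidates)
    best = max(counts.values())
    for candidate in candidates:
        if counts[candidate] == best:
            return candidate


def combine_by_voting(outputs):
    """Combine per-analyser outputs, aligned word by word, by majority vote."""
    return [vote(list(column)) for column in zip(*outputs)]


def exact_match_accuracy(predicted, gold):
    """Fraction of words whose predicted analysis equals the gold analysis."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical toy data: three analysers, four words, one root per word.
gold = ["ktb", "drs", "Elm", "qwl"]
analyser_outputs = [
    ["ktb", "drs", "Elm", "qAl"],
    ["ktb", "dArs", "Elm", "qwl"],
    ["kAtb", "drs", "Elm", "qwl"],
]

combined = combine_by_voting(analyser_outputs)
for name, output in zip(["A", "B", "C"], analyser_outputs):
    print(name, exact_match_accuracy(output, gold))
print("voted", exact_match_accuracy(combined, gold))
```

In this toy setup each analyser makes one error on a different word, so the voted output recovers every gold root. Real analysers often share errors, in which case voting helps much less.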
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Atwell, ES; Sawalha, M |
Editors: | Hardie, A; Love, R |
Dates: | 2013 |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) > Artificial Intelligence & Biological Systems (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 02 Dec 2014 12:14 |
Last Modified: | 19 Dec 2022 13:29 |
Published Version: | http://ucrel.lancs.ac.uk/cl2013/ |
Status: | Published |
Publisher: | UCREL |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:81414 |