
Comparative evaluation of Arabic language morphological analysers and stemmers

Sawalha, M. and Atwell, E.S. (2008) Comparative evaluation of Arabic language morphological analysers and stemmers. In: Proceedings of COLING 2008, 22nd International Conference on Computational Linguistics (Poster Volume). COLING 2008, 18 Aug 2008 - 22 Aug 2008, Manchester. Coling 2008 Organizing Committee, Manchester, 107-110.

Full text available as: sawalha08coling_front.pdf (133Kb)

Abstract

Arabic morphological analysers and stemming algorithms have become a popular area of research. Many computational linguists have designed and developed algorithms to solve the problem of morphology and stemming. Each researcher proposed his own gold standard, testing methodology and accuracy measurements to test and compute the accuracy of his algorithm. Therefore, we cannot make comparisons between these algorithms. In this paper we have accomplished two tasks. First, we proposed four different fair and precise accuracy measurements and two 1000-word gold standards taken from the Holy Qur’an and from the Corpus of Contemporary Arabic. Second, we combined the results from the morphological analysers and stemming algorithms by voting after running them on the sample documents. The evaluation of the algorithms shows that Arabic morphology is still a challenge.

Item Type: Proceedings Paper
Institution: The University of Leeds
Academic Units: The University of Leeds > Faculty of Engineering (Leeds) > School of Computing (Leeds)
Depositing User: Symplectic Publications
Date Deposited: 16 Nov 2010 10:54
Last Modified: 15 Sep 2014 01:23
Published Version: http://www.aclweb.org/anthology/C/C08/C08-2027.pdf
Status: Published
Publisher: Coling 2008 Organizing Committee
URI: http://eprints.whiterose.ac.uk/id/eprint/42635
