Alosaimy, A and Atwell, E orcid.org/0000-0001-9395-3764 (2017) Tagging Classical Arabic Text using Available Morphological Analysers and Part of Speech Taggers. Journal for Language Technology and Computational Linguistics, 32 (1/2017). pp. 1-26. ISSN 2190-6858
Abstract
This paper focuses on Classical Arabic (CA). In the first part, we evaluate morphological analysers and Part-of-Speech (POS) taggers that are freely available for research purposes, are designed for Modern Standard Arabic (MSA) or CA, are able to analyse all forms of words, and have academic credibility. We list and compare the features each tool supports, and show how the tools differ in output format, segmentation, POS tags and morphological features. We demonstrate a sample output of each analyser on one fully vowelized CA sentence. This evaluation serves as a guide for choosing the tool that best suits a given research need. In the second part, we report the accuracy and coverage of tagging a set of CA vocabulary extracted from classical texts. The results show a drop in accuracy and coverage, and suggest that an ensemble method might increase both for Classical Arabic.
Metadata
Item Type: | Article |
---|---|
Authors/Creators: | Alosaimy, A; Atwell, E (orcid.org/0000-0001-9395-3764) |
Copyright, Publisher and Additional Information: | This is an open access article under the terms of the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0). |
Dates: | |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 18 Jan 2018 12:54 |
Last Modified: | 10 Jan 2020 19:21 |
Published Version: | https://jlcl.org/allissues |
Status: | Published |
Publisher: | German Society for Computational Linguistics & Language Technology (GSCL) |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:126376 |