Alrehaili, SM orcid.org/0000-0002-4957-2478 and Atwell, E orcid.org/0000-0001-9395-3764 (2017) Extraction of Multi-Word Terms and Complex Terms from the Classical Arabic Text of the Quran. International Journal on Islamic Applications in Computer Science And Technology, 5 (3). pp. 15-27. ISSN 2289-4012
Abstract
The identification of domain-specific terms is a crucial step in many natural language processing applications. Term extraction is the process of obtaining a set of terms that represent the domain of a given text. Most term extraction research on the Quran has used translated text instead of the original Classical Arabic text. Extracting terms from the original Arabic text rather than a translation may help in retrieving more relevant terms, because some Quran terms lack Islamic equivalents in other languages. This paper demonstrates a hybrid method for acquiring a list of domain-specific terms from the Arabic text of the Quran. The resulting list of terms was validated using a common evaluation metric for ranked lists; precision of up to 0.81 was achieved for the top 200 terms. We discuss the precision achieved in the context of two existing datasets from previous research.
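The abstract does not name the ranked-list metric, so the following is only a minimal sketch that assumes it is precision at k: the fraction of the top k ranked candidates that appear in a reference list. Under that reading, a precision of 0.81 for the top 200 terms would correspond to 162 of those 200 candidates being judged correct. The function name `precision_at_k` and the placeholder terms are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Sequence, Set


def precision_at_k(ranked_terms: Sequence[str], gold_terms: Iterable[str], k: int) -> float:
    """Fraction of the top-k ranked candidates that appear in the gold list."""
    if k <= 0:
        raise ValueError("k must be positive")
    gold: Set[str] = set(gold_terms)
    top_k = ranked_terms[:k]
    # Count the candidates in the top k that the reference list confirms.
    hits = sum(1 for term in top_k if term in gold)
    # Divide by k (the conventional denominator), even if fewer than k candidates exist.
    return hits / k


# Hypothetical placeholder data, not results from the paper.
ranked = ["t1", "t2", "t3", "t4", "t5"]   # candidate terms, best first
gold = {"t1", "t3", "t4", "t9"}           # reference (validated) terms

p_at_3 = precision_at_k(ranked, gold, 3)  # 2 of the top 3 are in the gold set
p_at_5 = precision_at_k(ranked, gold, 5)  # 3 of the top 5 are in the gold set
print(p_at_3, p_at_5)
```

Dividing by k rather than by the number of candidates returned is a design choice: it penalises a ranked list that is shorter than k, which keeps scores comparable across lists of different lengths.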
Metadata
| Item Type: | Article |
|---|---|
| Authors/Creators: | Alrehaili, SM; Atwell, E |
| Editors: | |
| Keywords: | Quran terms; automatic term recognition; term extraction |
| Dates: | |
| Institution: | The University of Leeds |
| Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
| Depositing User: | Symplectic Publications |
| Date Deposited: | 21 Nov 2017 11:48 |
| Last Modified: | 05 Aug 2019 09:55 |
| Status: | Published |
| Publisher: | Design for Scientific Renaissance |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:124245 |