Alrehaili, SM orcid.org/0000-0002-4957-2478 and Atwell, E orcid.org/0000-0001-9395-3764 (2016) A Hybrid-based Term Extraction method on the Arabic text of the Qur'an. In: IMAN'2016 4th International Conference on Islamic Applications in Computer Science and Technologies. IMAN'2016 4th International Conference on Islamic Applications in Computer Science and Technologies, 20-22 Dec 2016, Khartoum, Sudan.
Abstract
The identification of relevant domain terms is a crucial step in numerous natural language processing applications. Term Extraction is the process of obtaining a set of terms that represent the domain of a given text. The majority of Term Extraction research projects conducted for the Qur’an have used translated text instead of the original Arabic text. Extracting terms from the original Arabic text rather than a translation may help in retrieving more relevant terms, because some Quranic terms lack equivalents in other languages. This paper demonstrates a hybrid-based method for acquiring a list of domain-specific terms from the Arabic text of the Qur’an. The resulting list of terms was validated using a common evaluation measure for ranked lists; a precision of up to 0.81 was achieved for the top 200 terms. We also discuss the low precision obtained when evaluating the results against two existing datasets from previous research.
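The abstract reports precision over the top 200 terms of a ranked list. As a minimal sketch of how such a figure is commonly computed (precision at k), and not the authors' actual evaluation code, the following assumes a ranked list of candidate terms and a set of reference terms judged relevant. The function name `precision_at_k` and the placeholder data are hypothetical and serve only as illustration.

```python
def precision_at_k(ranked_terms, gold_terms, k):
    """Fraction of the top-k ranked candidates that appear in the gold set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_terms[:k]
    # Count how many of the top-k candidates are judged relevant.
    hits = sum(1 for term in top_k if term in gold_terms)
    return hits / k


# Hypothetical placeholder data, not taken from the paper.
ranked = ["term_a", "term_b", "noise_1", "term_c", "noise_2"]
gold = {"term_a", "term_b", "term_c"}

for k in (2, 4, 5):
    print(k, precision_at_k(ranked, gold, k))
```

Under this definition, the reported value would correspond to `k = 200`, with the reference set standing in for whatever relevance judgments the evaluation used.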
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Alrehaili, SM; Atwell, E |
Keywords: | term extraction; automatic term recognition; Quranic terms |
Dates: | |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 30 Mar 2017 09:22 |
Last Modified: | 14 Aug 2019 15:03 |
Status: | Published |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:112947 |