A Hybrid-based Term Extraction method on the Arabic text of the Qur'an

Alrehaili, SM orcid.org/0000-0002-4957-2478 and Atwell, E orcid.org/0000-0001-9395-3764 (2016) A Hybrid-based Term Extraction method on the Arabic text of the Qur'an. In: IMAN'2016 4th International Conference on Islamic Applications in Computer Science and Technologies. IMAN'2016 4th International Conference on Islamic Applications in Computer Science and Technologies, 20-22 Dec 2016, Khartoum, Sudan.

Abstract

The identification of relevant domain terms is a crucial step in numerous natural language processing applications. Term Extraction is the process of obtaining a set of terms that represent the domain of a given text. The majority of Term Extraction research projects conducted for the Qur’an have used translated text instead of the original text of the Qur’an. Extracting terms from the original Arabic text rather than from a translation may help in retrieving more relevant terms, because some Quranic terms have no equivalent in other languages. This paper demonstrates a hybrid-based method for acquiring a list of domain-specific terms from the Arabic text of the Qur’an. The resulting list of terms was validated using a common evaluation measure for ranked lists; a precision of up to 0.81 was achieved for the top 200 terms. We discuss the low precision obtained when the result is evaluated against two existing datasets from previous research.
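
The abstract reports precision over the top 200 terms of a ranked list. A minimal sketch of how such a figure is commonly computed (precision at k) follows. It assumes a gold-standard set of reference terms is available; the function name and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Iterable, List, Set


def precision_at_k(ranked_terms: List[str], gold_terms: Iterable[str], k: int) -> float:
    """Fraction of the top-k ranked candidate terms that appear in the gold set.

    Illustrative helper only; not the authors' evaluation code.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    gold: Set[str] = set(gold_terms)
    top_k = ranked_terms[:k]
    hits = sum(1 for term in top_k if term in gold)
    # Divide by k rather than len(top_k), so a list shorter than k is not rewarded.
    return hits / k


# Toy data (hypothetical): candidate terms ordered best first, plus a gold set.
ranked = ["t1", "t2", "t3", "t4"]
gold = {"t1", "t3"}
score = precision_at_k(ranked, gold, k=2)
```

With a real ranked list and reference set, the same call with `k=200` would give a figure comparable in kind to the one quoted in the abstract.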

Metadata

Item Type:	Proceedings Paper
Authors/Creators:	Alrehaili, SM https://orcid.org/0000-0002-4957-2478; Atwell, E https://orcid.org/0000-0001-9395-3764
Keywords:	term extraction; automatic term recognition; Quranic terms
Dates:	Accepted: 1 October 2016; Published: 22 December 2016
Institution:	The University of Leeds
Academic Units:	The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds)
Depositing User:	Symplectic Publications
Date Deposited:	30 Mar 2017 09:22
Last Modified:	14 Aug 2019 15:03
Status:	Published
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:112947
