Alshammeri, M, Atwell, E orcid.org/0000-0001-9395-3764 and Alsalka, MA (2021) Detecting Semantic-based Similarity Between Verses of The Quran with Doc2vec. In: Procedia Computer Science. Fifth International Conference On AI In Computational Linguistics, 04-05 Jun 2021, Virtual. Elsevier, pp. 351-358.
Abstract
Semantic similarity analysis of natural language text has attracted considerable attention in recent years. Semantic analysis of the Quran is especially challenging because the text is not simply factual but encodes subtle religious meanings. Investigating similarity and relatedness between Quranic verses is an active research topic and can support the acquisition of the underlying knowledge. We therefore use an NLP method to detect semantic similarity between the verses of the Quran. The idea is to exploit distributed representations of text to learn an informative representation of the Quran’s passages. We map the Arabic Quranic verses to numerical vectors that encode the semantic properties of the text, and then measure the similarity between those vectors. The model's performance is judged through cosine similarity between our assigned semantic similarity scores and annotated textual similarity datasets. Our model achieved 76% accuracy in detecting similarity, and it can serve as a basis for further experiments and research.
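The similarity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the verse identifiers, the 4-dimensional vectors, and the helper names `cosine_similarity` and `most_similar` are all hypothetical, and the vectors are hand-written placeholders rather than the output of a trained Doc2vec model. The sketch only shows how verses could be ranked by cosine similarity once each one has been mapped to a vector.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical verse embeddings (placeholders, NOT real model output).
verse_vectors = {
    "verse_a": [0.2, 0.7, -0.1, 0.4],
    "verse_b": [0.1, 0.8, -0.2, 0.5],
    "verse_c": [-0.6, 0.1, 0.9, -0.3],
}


def most_similar(query_id, vectors):
    """Rank every other verse by cosine similarity to the query verse."""
    query = vectors[query_id]
    scores = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in vectors.items()
        if other_id != query_id
    ]
    # Highest similarity first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = most_similar("verse_a", verse_vectors)
```

In a real pipeline the placeholder dictionary would be replaced by vectors inferred from a trained document-embedding model, and the ranking function would stay the same.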
Metadata
| Item Type: | Proceedings Paper |
|---|---|
| Authors/Creators: | Alshammeri, M; Atwell, E (orcid.org/0000-0001-9395-3764); Alsalka, MA |
| Copyright, Publisher and Additional Information: | © 2021 The Authors. Published by ELSEVIER B.V. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0) |
| Keywords: | The Quran; Text similarity; Semantic-based similarity; NLP; Document embeddings; Doc2vec |
| Institution: | The University of Leeds |
| Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
| Depositing User: | Symplectic Publications |
| Date Deposited: | 10 Sep 2021 10:21 |
| Last Modified: | 10 Sep 2021 10:21 |
| Status: | Published |
| Publisher: | Elsevier |
| Identification Number: | 10.1016/j.procs.2021.05.104 |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:178001 |