Altammami, S orcid.org/0000-0002-3801-8236 and Atwell, E (2022) Challenging the Transformer-based models with a Classical Arabic dataset: Quran and Hadith. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference. 13th Language Resources and Evaluation Conference, 20-25 Jun 2022, Marseille, France. European Language Resources Association, pp. 1462-1471.
Abstract
Transformer-based models have shown near-perfect results on several downstream tasks. However, their performance on classical Arabic texts is largely unexplored. To fill this gap, we evaluate monolingual, bilingual, and multilingual state-of-the-art models on detecting relatedness between the Quran (the Muslim holy book) and the Hadith (the Prophet Muhammed's teachings), which are complex classical Arabic texts with underlying meanings that require deep human understanding. To do this, we carefully built a dataset of Quran-verse and Hadith-teaching pairs by consulting sources of reputable religious experts. This study presents the methodology for creating the dataset, which we make available on our repository, and discusses the models’ performance, which points to an urgent need to explore avenues for improving the quality of these models so that they capture the semantics of such complex, low-resource texts.
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Altammami, S; Atwell, E |
Copyright, Publisher and Additional Information: | © European Language Resources Association (ELRA). This is an open access conference paper under the terms of the Creative Commons Attribution-NonCommercial License (CC-BY-NC 4.0). |
Keywords: | Hadith, Quran, dataset, semantic similarity |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 31 May 2022 14:35 |
Last Modified: | 07 Aug 2023 10:25 |
Published Version: | https://aclanthology.org/2022.lrec-1.157 |
Status: | Published |
Publisher: | European Language Resources Association |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:187453 |