Alsaleh, AN, Atwell, E orcid.org/0000-0001-9395-3764 and Altahhan, A (2021) Quranic Verses Semantic Relatedness Using AraBERT. In: Proceedings of the Sixth Arabic Natural Language Processing Workshop. The Sixth Arabic Natural Language Processing Workshop (WANLP 2021), 19 Apr 2021, Kyiv, Ukraine (Online), pp. 185-190.
Abstract
Bidirectional Encoder Representations from Transformers (BERT) has gained popularity in recent years, producing state-of-the-art performance across Natural Language Processing tasks. In this paper, we used the AraBERT language model to classify pairs of verses provided by the QurSim dataset as either semantically related or not. We pre-processed the QurSim dataset and formed three datasets for comparison. We also used both versions of AraBERT, AraBERTv02 and AraBERTv2, to determine which version performs best on the given datasets. The best result was obtained with AraBERTv02, which achieved a 92% accuracy score using a dataset comprising pairs with label '2' and label '-1', the latter generated outside the QurSim dataset.
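The abstract does not say how the label '-1' pairs were produced, only that they were generated outside QurSim. As a rough illustration of one way such a two-label dataset could be assembled, the sketch below keeps the label '2' pairs and samples an equal number of verse pairs that do not appear in QurSim at all. The function name `build_binary_dataset`, the tuple layout, the verse ID format, and the random-sampling strategy are all assumptions made for this example, not the authors' procedure.

```python
import random


def build_binary_dataset(qursim_pairs, verse_ids, seed=0):
    """Hypothetical helper: keep label-2 pairs, add sampled label -1 pairs.

    qursim_pairs: list of (verse_a, verse_b, label) tuples (assumed layout).
    verse_ids:    list of all verse identifiers to sample negatives from.
    """
    rng = random.Random(seed)
    positives = [(a, b, 2) for a, b, label in qursim_pairs if label == 2]

    # Pairs listed in QurSim with any label are never used as negatives.
    known = {frozenset((a, b)) for a, b, _ in qursim_pairs}

    # Conservative feasibility check so the sampling loop always terminates.
    n = len(verse_ids)
    if n * (n - 1) // 2 - len(known) < len(positives):
        raise ValueError("not enough unlisted verse pairs to sample from")

    negatives, seen = [], set()
    while len(negatives) < len(positives):
        a, b = rng.sample(verse_ids, 2)  # two distinct verses
        key = frozenset((a, b))
        if key in known or key in seen:
            continue  # skip pairs QurSim already lists, and duplicates
        seen.add(key)
        negatives.append((a, b, -1))  # assumed "not related" label

    dataset = positives + negatives
    rng.shuffle(dataset)
    return dataset


# Toy verse IDs in "chapter:verse" form; the labels here are illustrative only.
toy_pairs = [("1:1", "1:2", 2), ("1:3", "1:4", 1), ("1:5", "1:6", 2)]
toy_ids = ["1:1", "1:2", "1:3", "1:4", "1:5", "1:6", "1:7"]
toy_dataset = build_binary_dataset(toy_pairs, toy_ids, seed=1)
```

The resulting list of (verse, verse, label) tuples is balanced between the two labels, so a plain accuracy score on it is straightforward to interpret. Whether the paper's dataset was balanced in this way is not stated in the abstract.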
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Alsaleh, AN; Atwell, E (ORCID: 0000-0001-9395-3764); Altahhan, A |
Copyright, Publisher and Additional Information: | This item is protected by copyright, all rights reserved. This is an open access article under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/) |
Dates: | |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 25 Mar 2021 16:59 |
Last Modified: | 04 May 2021 14:48 |
Published Version: | https://www.aclweb.org/anthology/2021.wanlp-1.19 |
Status: | Published |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:172516 |