Altammami, S orcid.org/0000-0002-3801-8236, Atwell, E orcid.org/0000-0001-9395-3764 and Alsalka, A (2020) The Arabic–English Parallel Corpus of Authentic Hadith. In: International Journal on Islamic Applications in Computer Science And Technology - IJASAT. International Conference on Islamic Applications in Computer Science and Technologies - IMAN 2019, 27-28 Dec 2019. Design For Scientific Renaissance, pp. 1-10.
Abstract
We present a bilingual parallel corpus of Islamic Hadith, the set of narratives reporting different aspects of the Prophet Muhammad's life. The collection is drawn from the six canonical Hadith books, whose distinctive linguistic features and patterns are automatically extracted and annotated using a domain-specific tool for Hadith segmentation. In this article, we present the methodology used to create the corpus of 39,038 annotated Hadiths, which will be freely available to the research community.
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Altammami, S; Atwell, E; Alsalka, A |
Keywords: | Hadith, Parallel Corpus, NLP, Language Resource |
Dates: | |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 12 May 2020 10:53 |
Last Modified: | 28 Jan 2022 10:47 |
Published Version: | http://www.sign-ific-ance.co.uk/index.php/IJASAT/a... |
Status: | Published |
Publisher: | Design For Scientific Renaissance |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:160497 |