Tarmom, T orcid.org/0000-0002-2834-461X, Atwell, E orcid.org/0000-0001-9395-3764 and Alsalka, MA orcid.org/0000-0003-3335-1918 (2020) Automatic Hadith Segmentation using PPM Compression. In: Bhattacharyya, P, Sharma, DM and Sangal, R, (eds.) Proceedings of the 17th International Conference on Natural Language Processing (ICON). 17th International Conference on Natural Language Processing (ICON), 18-21 Dec 2020, Patna, India. NLP Association of India (NLPAI), pp. 22-29.
Abstract
In this paper we explore the use of a Prediction by Partial Matching (PPM) compression-based method to segment Hadith into its two main components (Isnad and Matan). The experiments used the PPMD variant of PPM and showed that PPMD is effective in Hadith segmentation. It was also tested on Hadith corpora of different structures. In the first experiment we used the non-authentic Hadith (NAH) corpus for training models and testing, and in the second experiment we used the NAH corpus for training models and the Leeds University and King Saud University (LK) Hadith corpus for testing the PPMD segmenter. PPMD of order 7 achieved an accuracy of 92.76% and 90.10% in the first and second experiments, respectively.
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Tarmom, T; Atwell, E; Alsalka, MA |
Editors: | Bhattacharyya, P; Sharma, DM; Sangal, R |
Copyright, Publisher and Additional Information: | © 2020 NLP Association of India (NLPAI). This is an open access conference paper under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. |
Dates: |
|
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 13 Apr 2021 08:33 |
Last Modified: | 04 Dec 2023 16:29 |
Published Version: | https://aclanthology.org/2020.icon-main.4 |
Status: | Published |
Publisher: | NLP Association of India (NLPAI) |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:172041 |