Al-Salman, A, Alrabiah, MS and Atwell, E orcid.org/0000-0001-9395-3764 (2013) KSUCCA the Corner Stone for Studying the Meanings of the Holy Quran Words in the Light of Distributional Semantic Models. In: 2013 Taibah University International Conference on Advances in Information Technology for the Holy Quran and Its Sciences. 2013 Taibah University International Conference on Advances in Information Technology for the Holy Quran and Its Sciences, 22-25 Dec 2013, Al-Madinah, Saudi Arabia. IEEE, pp. 652-659. ISBN 978-1-4799-2822-4
Abstract
Distributional semantic models are one of the empiricist approaches to studying language structure and design. They are based mainly on building semantic models of word meanings through statistical analysis of word distribution in very large corpora. In this paper, we present the King Saud University Corpus of Classical Arabic (KSUCCA), which is intended as the cornerstone for studying distributional lexical semantic models of the words of the Holy Quran. It is a free corpus of more than 50 million words, containing texts dating from the pre-Islamic era to the fourth Hijri century. We describe the design guidelines for KSUCCA, including its aim, balance, representation, text sampling, copyright, character encoding and file organization. We also present some preliminary experiments carried out on KSUCCA and their results.
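As a rough illustration of the distributional idea described in the abstract, the following minimal sketch builds co-occurrence vectors from a token sequence and compares words by cosine similarity. It is not the method used in the paper: the window size, the function names `cooccurrence_vectors` and `cosine`, and the English toy sentence are all assumptions made for illustration, and no KSUCCA data is involved.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(tokens, window=2):
    """Count, for each word, the words that appear within `window` positions of it."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy text, chosen only to show the mechanics.
toy = "the king rules the land the queen rules the land the bird eats the seed".split()
vecs = cooccurrence_vectors(toy, window=2)

# Words that share more contexts ("rules") end up with more similar vectors.
sim_king_queen = cosine(vecs["king"], vecs["queen"])
sim_king_bird = cosine(vecs["king"], vecs["bird"])
print(sim_king_queen > sim_king_bird)
```

On a real corpus, raw counts would typically be reweighted (for example with pointwise mutual information) before comparison; the sketch omits that step for brevity.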
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Al-Salman, A; Alrabiah, MS; Atwell, E (orcid.org/0000-0001-9395-3764) |
Keywords: | The Holy Quran; Corpus; Classical Arabic; KSUCCA; Distributional Semantic Models; Computational Linguistics |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 09 Mar 2017 13:14 |
Last Modified: | 09 Mar 2017 13:14 |
Published Version: | https://doi.org/10.1109/NOORIC.2013.103 |
Status: | Published |
Publisher: | IEEE |
Identification Number: | 10.1109/NOORIC.2013.103 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:101736 |