Sharaf, A-BM and Atwell, ES orcid.org/0000-0001-9395-3764 (2012) QurAna: Corpus of the Quran annotated with Pronominal Anaphora. In: Calzolari, N (Conference Chair), Choukri, K, Declerck, T, Doğan, MU, Maegaard, B, Mariani, J, Odijk, J and Piperidis, S, (eds.) Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12). LREC 2012, Eighth International Conference on Language Resources and Evaluation, 21-27 May 2012, Istanbul, Turkey. European Language Resources Association (ELRA), Istanbul, Turkey, pp. 130-137. ISBN 978-2-9517408-7-7
Abstract
This paper presents QurAna: a large corpus created from the original Quranic text, where personal pronouns are tagged with their antecedents. These antecedents are maintained as an ontological list of concepts, which has proved helpful for information retrieval tasks. QurAna is characterized by: (a) a comparatively large number of pronouns tagged with antecedent information (over 24,500 pronouns), and (b) maintenance of an ontological concept list built from these antecedents. We have shown useful applications of this corpus. This corpus is the first of its kind covering Classical Arabic text, and could be used for interesting applications for Modern Standard Arabic as well. This corpus will enable researchers to obtain empirical patterns and rules to build new anaphora resolution approaches. Also, this corpus can be used to train, optimize and evaluate existing approaches.
Metadata
Item Type: | Proceedings Paper |
---|---|
Keywords: | Anaphora; Quran; Corpus |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 08 Sep 2017 14:31 |
Last Modified: | 10 Sep 2017 06:59 |
Published Version: | http://www.lrec-conf.org/proceedings/lrec2012/pdf/... |
Status: | Published |
Publisher: | European Language Resources Association (ELRA) |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:101163 |