Sawalha, M, Brierley, C and Atwell, E (2014) Automatically generated, phonemic Arabic-IPA pronunciation tiers for the boundary annotated Qur’an dataset for machine learning (version 2.0). In: Proceedings of LRE-Rel 2: 2nd Workshop on Language Resource and Evaluation for Religious Texts, LREC 2014 post-conference workshop, 31st May 2014, Harpa Conference Center, Reykjavik, Iceland. The University of Leeds, 42-47.
Abstract
In this paper, we augment the Boundary Annotated Qur’an dataset published at LREC 2012 (Brierley et al 2012; Sawalha et al 2012a) with automatically generated phonemic transcriptions of Arabic words. We have developed and evaluated a comprehensive grapheme-phoneme mapping from Standard Arabic to IPA (Brierley et al under review), and implemented the mapping in Arabic transcription technology that achieves 100% accuracy as measured against two gold standards: one for Qur’anic or Classical Arabic, and one for Modern Standard Arabic (Sawalha et al [1]). Our mapping algorithm has also been used to generate a pronunciation guide for a subset of Qur’anic words with heightened prosody (Brierley et al 2014). This is funded research under the EPSRC "Working Together" theme.
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Sawalha, M; Brierley, C; Atwell, E |
Copyright, Publisher and Additional Information: | Sawalha, M, Brierley, C and Atwell, E (c) 2014, University of Leeds. Reproduced with permission from the copyright holders. |
Keywords: | IPA phonemic transcription; SALMA Tagger; Arabic transcription technology |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) > Artificial Intelligence & Biological Systems (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 27 Nov 2014 09:55 |
Last Modified: | 15 Jan 2018 23:13 |
Published Version: | http://www.lrec-conf.org/proceedings/lrec2014/work... |
Status: | Published |
Publisher: | The University of Leeds |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:81481 |