Sawalha, M, Brierley, C and Atwell, E (2014) Automatically generated, phonemic Arabic-IPA pronunciation tiers for the boundary annotated Qur’an dataset for machine learning (version 2.0). In: Proceedings of LRE-Rel 2: 2nd Workshop on Language Resource and Evaluation for Religious Texts, LREC 2014 post-conference workshop, 31st May 2014, Harpa Conference Center, Reykjavik, Iceland. The University of Leeds, pp. 42-47.
Abstract
In this paper, we augment the Boundary Annotated Qur’an dataset published at LREC 2012 (Brierley et al 2012; Sawalha et al 2012a) with automatically generated phonemic transcriptions of Arabic words. We have developed and evaluated a comprehensive grapheme-phoneme mapping from Standard Arabic > IPA (Brierley et al under review), and implemented the mapping in Arabic transcription technology that achieves 100% accuracy as measured against two gold standards: one for Qur’anic or Classical Arabic, and one for Modern Standard Arabic (Sawalha et al [1]). Our mapping algorithm has also been used to generate a pronunciation guide for a subset of Qur’anic words with heightened prosody (Brierley et al 2014). This is funded research under the EPSRC "Working Together" theme.
Metadata
| Item Type: | Proceedings Paper | 
|---|---|
| Authors/Creators: | Sawalha, M; Brierley, C; Atwell, E |
| Copyright, Publisher and Additional Information: | Sawalha, M, Brierley, C and Atwell, E (c) 2014, University of Leeds. Reproduced with permission from the copyright holders. |
| Keywords: | IPA phonemic transcription; SALMA Tagger; Arabic transcription technology | 
| Dates: | |
| Institution: | The University of Leeds | 
| Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) > Artificial Intelligence & Biological Systems (Leeds) | 
| Depositing User: | Symplectic Publications | 
| Date Deposited: | 27 Nov 2014 09:55 | 
| Last Modified: | 15 Jan 2018 23:13 | 
| Published Version: | http://www.lrec-conf.org/proceedings/lrec2014/work... | 
| Status: | Published | 
| Publisher: | The University of Leeds | 
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:81481 | 