Alfaifi, A and Atwell, ES (2014) An evaluation of the Arabic error tagset v2. In: Proceedings of the AACL 2014 - The American Association for Corpus Linguistics conference. AACL 2014 - The American Association for Corpus Linguistics conference, 26-28 Sep 2014, Flagstaff, USA. The American Association for Corpus Linguistics
Abstract
A survey of the literature shows that annotating errors of Arabic learners has not received much attention, and there is a need for a practical error tagset which can be used for Arabic learner corpora. This type of tagset is used in such corpora for several purposes, e.g., Contrastive Interlanguage Analysis (CIA), learner dictionary making, Second Language Acquisition, designing pedagogical materials, etc. This paper evaluates the second version of a two-level error tagset developed for annotating the Arabic Learner Corpus (ALC). It includes six broad classes, subdivided into more specific error types. The paper presents the tagset and an example of the annotation method used for tagging the ALC. The inter-annotator agreement using the current revised version of the error tagset was higher than with the first version (Alfaifi et al., 2013). Four factors assisted in reaching this level of accuracy: (1) the tagset was reviewed by two experts in Arabic language, (2) the annotators were given texts with errors already identified, so their task was to classify each error and mark it with the appropriate tag, (3) the annotators were trained during the experiment, (4) an error tagging manual was created which explains all error types in the tagset with rules and examples of how to tag learners' errors. Two lists of varied sentences, 100 in each, were tagged for errors by three annotators; after tagging the first list, the annotators discussed their work as a form of training, and this allowed us to distinguish the value of the training from that of the other factors.
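The abstract does not say which statistic was used to measure inter-annotator agreement. As an illustration only, the sketch below computes Fleiss' kappa, one common choice when a fixed number of annotators (three here) each assign a single tag to every item. The function name `fleiss_kappa` and the tag labels `TAG_A`, `TAG_B`, `TAG_C` are hypothetical placeholders, not tags from the ALC error tagset, and the sample ratings are made up for the demonstration.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of tags (one per annotator).

    Assumption: every item is tagged by the same number of annotators,
    and each annotator assigns exactly one tag per item.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    category_totals = Counter()
    observed_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Share of annotator pairs that agree on this item.
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        observed_sum += agreeing_pairs / (n_raters * (n_raters - 1))

    p_observed = observed_sum / n_items
    total_ratings = n_items * n_raters
    # Agreement expected by chance, from the overall tag proportions.
    p_expected = sum((c / total_ratings) ** 2 for c in category_totals.values())

    if p_expected == 1.0:
        # Only one tag was ever used, so kappa is undefined; report 1.0 by convention.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical ratings: three annotators, four pre-identified errors.
    sample = [
        ["TAG_A", "TAG_A", "TAG_A"],
        ["TAG_B", "TAG_B", "TAG_A"],
        ["TAG_C", "TAG_C", "TAG_C"],
        ["TAG_B", "TAG_B", "TAG_B"],
    ]
    print(f"Fleiss' kappa: {fleiss_kappa(sample):.3f}")
```

Running the same function separately on the ratings for the first and second sentence lists would give two scores to compare, which is one way the effect of the training between the lists could be examined.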
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Alfaifi, A; Atwell, ES |
Copyright, Publisher and Additional Information: | Alfaifi, A and Atwell, ES (c) 2014, University of Leeds. Reproduced with permission from the copyright holders. |
Dates: | |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) > Artificial Intelligence & Biological Systems (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 27 Nov 2014 10:14 |
Last Modified: | 19 Dec 2022 13:29 |
Published Version: | http://www.learnercorpusassociation.org/event/1247... |
Status: | Published |
Publisher: | The American Association for Corpus Linguistics |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:81592 |