Alahmari, S. orcid.org/0009-0002-6490-3295, Atwell, E. orcid.org/0000-0001-9395-3764 and Alsalka, M.A. orcid.org/0000-0003-3335-1918 (2024) Saudi Arabic Multi-dialects Identification in Social Media Texts. In: Intelligent Computing: Proceedings of the 2024 Computing Conference, Volume 1. Computing Conference 2024 (SAI 2024), 11-12 Jul 2024, London, UK. Lecture Notes in Networks and Systems, 1016. Springer Nature, Cham, Switzerland, pp. 209-217. ISBN: 9783031622809. ISSN: 2367-3370. EISSN: 2367-3389.
Abstract
ChatGPT is a state-of-the-art, robust artificial intelligence language model that can be used in a wide range of Natural Language Processing (NLP) applications. These applications include, but are not restricted to, text classification, text generation, sentiment analysis, and question answering. ChatGPT is primarily aimed at generating English text, but it has also been found to process other languages, such as Arabic. This paper applies the ChatGPT model to the task of dialect identification, specifically for Saudi Arabic dialects. Five Saudi Arabic dialects were selected for this study: Hijazi (spoken in the western regions), Najdi (spoken in the central regions), Eastern (spoken in the eastern regions), Southern (spoken in the southern regions), and Northern (spoken in the northern regions). The experimental results demonstrate that ChatGPT achieved an overall accuracy of 0.42 on our sample dataset, which is higher than the 0.33 obtained with a Support Vector Machine (SVM).
Metadata
| Item Type: | Proceedings Paper |
|---|---|
| Authors/Creators: | Alahmari, S.; Atwell, E.; Alsalka, M.A. |
| Copyright, Publisher and Additional Information: | © 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG. This version of the article has been accepted for publication, after peer review (when applicable) and is subject to Springer Nature’s AM terms of use (https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms), but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/10.1007/978-3-031-62281-6_15. |
| Keywords: | ChatGPT; Saudi Arabic dialects; Dialects identification; Arabic NLP; Social media |
| Dates: | |
| Institution: | The University of Leeds |
| Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
| Date Deposited: | 26 Nov 2025 09:51 |
| Last Modified: | 26 Nov 2025 10:34 |
| Published Version: | https://link.springer.com/chapter/10.1007/978-3-03... |
| Status: | Published |
| Publisher: | Springer Nature |
| Series Name: | Lecture Notes in Networks and Systems |
| Identification Number: | 10.1007/978-3-031-62281-6_15 |
| Related URLs: | |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:234852 |
