On the vulnerabilities of Text-to-SQL models

Peng, X., Zhang, Y., Yang, J. and Stevenson, R.M. (2023) On the vulnerabilities of Text-to-SQL models. In: 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE) Proceedings. 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE), 09-12 Oct 2023, Florence, Italy. Institute of Electrical and Electronics Engineers (IEEE). ISBN 9798350315950

Abstract

Although it has been demonstrated that Natural Language Processing (NLP) algorithms are vulnerable to deliberate attacks, the question of whether such weaknesses can lead to software security threats is under-explored. To bridge this gap, we conducted vulnerability tests on Text-to-SQL systems that are commonly used to create natural language interfaces to databases. We showed that the Text-to-SQL modules within six commercial applications can be manipulated to produce malicious code, potentially leading to data breaches and Denial of Service attacks. This is the first demonstration that NLP models can be exploited as attack vectors in the wild. In addition, experiments using four open-source language models verified that straightforward backdoor attacks on Text-to-SQL systems achieve a 100% success rate without affecting their performance. The aim of this work is to draw the community’s attention to potential software security issues associated with NLP algorithms and encourage exploration of methods to mitigate against them.
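The risk described in the abstract arises when SQL emitted by a Text-to-SQL model is executed without checks. The sketch below is an illustration only and is not taken from the paper: it assumes a hypothetical SQLite database with a `users` table, and the two SQL strings are hard-coded stand-ins for model output. It uses the `set_authorizer` hook of Python's standard `sqlite3` module to reject any generated statement that is not a plain read.

```python
import sqlite3


def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Allow only SELECT statements and column reads; deny everything else."""
    if action in (sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ):
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY


def build_demo_connection():
    """Create a hypothetical in-memory database, then restrict it to reads."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice')")
    conn.commit()
    # Install the authorizer only after the demo data has been written.
    conn.set_authorizer(read_only_authorizer)
    return conn


def run_generated_sql(conn, sql):
    """Execute model-generated SQL; return rows, or None if SQLite rejects it."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.DatabaseError:
        return None


conn = build_demo_connection()
benign = "SELECT name FROM users"   # stand-in for ordinary model output
malicious = "DELETE FROM users"     # stand-in for manipulated model output
print(run_generated_sql(conn, benign))
print(run_generated_sql(conn, malicious))
```

A guard of this kind only blocks writes and schema changes. It does not prevent a manipulated query from reading data the user should not see, nor does it limit expensive read queries that could exhaust the database, so it covers only part of the threat surface named in the abstract.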

Metadata

Item Type:	Proceedings Paper
Authors/Creators:	Peng, X.; Zhang, Y.; Yang, J.; Stevenson, R.M. (ORCID: https://orcid.org/0000-0002-9483-6006)
Copyright, Publisher and Additional Information:	© 2023 The Author(s). Except as otherwise noted, this author-accepted version of a paper published in 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE) Proceedings is made available via the University of Sheffield Research Publications and Copyright Policy under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/
Keywords:	Natural Language Processing; Code Generation; Database; SQL Injection; Reliability Threat
Dates:	Accepted: 30 July 2023; Published (online): 2 November 2023; Published: 2 November 2023
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield)
Depositing User:	Symplectic Sheffield
Date Deposited:	15 Sep 2023 09:27
Last Modified:	10 Nov 2023 12:51
Status:	Published
Publisher:	Institute of Electrical and Electronics Engineers (IEEE)
Refereed:	Yes
Identification Number:	10.1109/ISSRE59848.2023.00047
Related URLs:	Conference
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:203349

Download

Accepted Version

Filename: issre23.pdf

Licence: CC-BY 4.0

