Checco, A. orcid.org/0000-0002-0981-3409, Bates, J. and Demartini, G. (2020) Adversarial attacks on crowdsourcing quality control. Journal of Artificial Intelligence Research, 67 (2020). pp. 375-408. ISSN 1076-9757
Abstract
Crowdsourcing is a popular methodology to collect manual labels at scale. Such labels are often used to train AI models and, thus, quality control is a key aspect in the process. One of the most popular quality assurance mechanisms in paid micro-task crowdsourcing is based on gold questions: the use of a small set of tasks for which the requester knows the correct answer and, thus, is able to directly assess crowd work quality. In this paper, we show that such a mechanism is prone to an attack, carried out by a group of colluding crowd workers, that is easy to implement and deploy: the inherent size limit of the gold set can be exploited by building an inferential system to detect which parts of the job are more likely to be gold questions. The described attack is robust to various forms of randomisation and programmatic generation of gold questions. We present the architecture of the proposed system, composed of a browser plug-in and an external server used to share information, and briefly introduce its potential evolution to a decentralised implementation. We implement and experimentally validate the gold detection system, using real-world data from a popular crowdsourcing platform. Our experimental results show that crowd workers using the proposed system spend more time on signalled gold questions but do not neglect the others, thus achieving an increased overall work quality. Finally, we discuss the economic and sociological implications of this kind of attack.
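The inference the abstract describes can be illustrated with a minimal sketch. It assumes, as a reading of the abstract rather than a description of the authors' system, that because the gold set is small, a gold question tends to be shown to many more workers than an ordinary task, so tasks reported by several colluding workers are likelier to be gold. The function name `flag_likely_gold`, the `min_workers` threshold, and the exact-match task keys are all hypothetical; the paper states that the attack is robust to randomisation, which exact matching is not.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Set, Tuple


def flag_likely_gold(
    observations: Iterable[Tuple[str, Hashable]],
    min_workers: int = 3,
) -> Set[Hashable]:
    """Return task keys reported by at least `min_workers` distinct workers.

    `observations` holds (worker_id, task_key) pairs, as a shared server
    might collect them. Both the data shape and the threshold are
    illustrative assumptions, not taken from the paper.
    """
    seen_by = defaultdict(set)
    for worker_id, task_key in observations:
        # A set per task means repeat reports from one worker count once.
        seen_by[task_key].add(worker_id)
    return {key for key, workers in seen_by.items() if len(workers) >= min_workers}


# Toy data: "task-g" is seen by three workers, every other task by at most two.
observations = [
    ("w1", "task-a"), ("w1", "task-g"),
    ("w2", "task-b"), ("w2", "task-g"),
    ("w3", "task-c"), ("w3", "task-g"),
    ("w3", "task-a"),
]
likely_gold = flag_likely_gold(observations, min_workers=3)
```

Counting distinct workers rather than raw reports keeps a single worker from inflating a task's score by reporting it repeatedly.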
Metadata
Item Type: | Article |
---|---|
Authors/Creators: | Checco, A.; Bates, J.; Demartini, G. |
Copyright, Publisher and Additional Information: | © 2020 AI Access Foundation. Reproduced in accordance with the publisher's self-archiving policy. |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield) |
Funding Information: | Funder: European Commission - Horizon 2020; Grant number: 732328 |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 15 Jan 2020 15:53 |
Last Modified: | 26 May 2020 08:32 |
Status: | Published |
Publisher: | AI Access Foundation |
Refereed: | Yes |
Identification Number: | 10.1613/jair.1.11332 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:155704 |