Developing a corpus of plagiarised short answers

Abstract

Plagiarism is widely acknowledged to be a significant and increasing problem for higher education institutions (McCabe 2005; Judge 2008). A wide range of solutions, including several commercial systems, has been proposed to assist the educator in identifying plagiarised work, or even to detect it automatically. Direct comparison of these systems is made difficult by the problems in obtaining genuine examples of plagiarised student work. We describe our initial experiences with constructing a corpus consisting of answers to short questions in which plagiarism has been simulated. This corpus is designed to represent types of plagiarism that are not included in existing corpora and will be a useful addition to the set of resources available for the evaluation of plagiarism detection systems.

Metadata

Item Type:	Article
Authors/Creators:	Clough, P.; Stevenson, M.
Copyright, Publisher and Additional Information:	© 2011 Springer. This is an author produced version of a paper subsequently published in Language Resources and Evaluation. Uploaded in accordance with the publisher's self-archiving policy.
Keywords:	Plagiarism; Plagiarism detection; Corpus creation; Language resources
Dates:	Published: March 2011
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield); The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield)
Depositing User:	Miss Anthea Tucker
Date Deposited:	25 Mar 2011 12:12
Last Modified:	08 Feb 2013 17:31
Published Version:	http://dx.doi.org/10.1007/s10579-009-9112-1
Status:	Published
Publisher:	Springer
Refereed:	Yes
Identification Number:	10.1007/s10579-009-9112-1
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:42922
