Towards building a standard dataset for Arabic keyphrase extraction evaluation

Abstract

Keyphrases are short phrases that best represent a document's content. They are useful in a variety of applications, including document summarization and retrieval models. In this paper, we introduce the first dataset of keyphrases for an Arabic document collection, obtained by means of crowdsourcing. We experimentally evaluate different strategies for aggregating crowdsourced answers and validate their performance against expert annotations to assess the quality of our dataset. We report on our experimental results, the dataset features,

Metadata

Item Type:	Proceedings Paper
Authors/Creators:	Helmy, M.; Basaldella, M.; Maddalena, E.; Mizzaro, S.; Demartini, G. https://orcid.org/0000-0002-7311-3693
Copyright, Publisher and Additional Information:	© 2016 IEEE. This is an author produced version of a paper subsequently published in Asian Language Processing (IALP), 2016 International Conference on. Uploaded in accordance with the publisher's self-archiving policy.
Keywords:	Arabic Language Resources; Dataset; Keyphrase Extraction; Crowdsourcing
Dates:	Accepted: 31 August 2016; Published (online): 13 March 2017; Published: 2017
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield)
Depositing User:	Symplectic Sheffield
Date Deposited:	21 Nov 2016 16:38
Last Modified:	19 Dec 2022 13:34
Published Version:	https://doi.org/10.1109/IALP.2016.7875927
Status:	Published
Publisher:	IEEE
Refereed:	Yes
Identification Number:	10.1109/IALP.2016.7875927
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:107611
