Thorne, J., Vlachos, A. orcid.org/0000-0003-2123-5071, Christodoulopoulos, C. et al. (1 more author) (2018) FEVER: a large-scale dataset for Fact Extraction and VERification. In: Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. The 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 01-06 Jun 2018, New Orleans.
Abstract
In this paper we introduce a new publicly available dataset for verification against textual sources, FEVER: Fact Extraction and VERification. It consists of 185,445 claims generated by altering sentences extracted from Wikipedia and subsequently verified without knowledge of the sentence they were derived from. The claims are classified as SUPPORTED, REFUTED or NOTENOUGHINFO by annotators achieving 0.6841 in Fleiss κ. For the first two classes, the annotators also recorded the sentence(s) forming the necessary evidence for their judgment. To characterize the challenge of the dataset presented, we develop a pipeline approach and compare it to suitably designed oracles. The best accuracy we achieve on labeling a claim accompanied by the correct evidence is 31.87%, while if we ignore the evidence we achieve 50.91%. Thus we believe that FEVER is a challenging testbed that will help stimulate progress on claim verification against textual sources.
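The two accuracy figures quoted in the abstract differ only in whether the retrieved evidence must also be correct. The sketch below illustrates that distinction on toy data. It is not the authors' scoring code: the `Example` class, the `(page, sentence index)` encoding of evidence, the helper names, and the rule that NOTENOUGHINFO claims need no evidence are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List, Tuple

# Hypothetical encoding of an evidence sentence: (page title, sentence index).
Sentence = Tuple[str, int]


@dataclass
class Example:
    gold_label: str
    pred_label: str
    # Each gold set is one complete, sufficient group of evidence sentences.
    gold_evidence_sets: List[FrozenSet[Sentence]] = field(default_factory=list)
    pred_evidence: FrozenSet[Sentence] = frozenset()


def label_only_correct(ex: Example) -> bool:
    """Evidence is ignored: only the predicted label has to match."""
    return ex.pred_label == ex.gold_label


def label_and_evidence_correct(ex: Example) -> bool:
    """The label must match and, for SUPPORTED/REFUTED claims, the predicted
    evidence must contain at least one complete gold evidence set."""
    if ex.pred_label != ex.gold_label:
        return False
    if ex.gold_label == "NOTENOUGHINFO":
        # Assumption: no evidence is required for this class.
        return True
    return any(gold <= ex.pred_evidence for gold in ex.gold_evidence_sets)


def accuracy(examples: List[Example], is_correct: Callable[[Example], bool]) -> float:
    """Fraction of examples judged correct under the given criterion."""
    if not examples:
        return 0.0
    return sum(is_correct(ex) for ex in examples) / len(examples)


# Toy data, invented for illustration only.
examples = [
    # Right label, gold evidence retrieved (plus one extra sentence).
    Example("SUPPORTED", "SUPPORTED",
            [frozenset({("Page_A", 0)})],
            frozenset({("Page_A", 0), ("Page_B", 3)})),
    # Right label, wrong evidence sentence.
    Example("REFUTED", "REFUTED",
            [frozenset({("Page_C", 2)})],
            frozenset({("Page_C", 5)})),
    # Right label, no evidence needed.
    Example("NOTENOUGHINFO", "NOTENOUGHINFO"),
    # Wrong label.
    Example("SUPPORTED", "REFUTED",
            [frozenset({("Page_D", 1)})],
            frozenset({("Page_D", 1)})),
]

label_accuracy = accuracy(examples, label_only_correct)
strict_accuracy = accuracy(examples, label_and_evidence_correct)
print(f"label only: {label_accuracy:.2f}, label + evidence: {strict_accuracy:.2f}")
```

On this toy data the stricter criterion can only lower the score, which mirrors the gap between the 50.91% and 31.87% figures reported above.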
Metadata
Item Type: | Proceedings Paper |
---|---|
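Authors/Creators: | Thorne, J.; Vlachos, A. (orcid.org/0000-0003-2123-5071); Christodoulopoulos, C. et al. (1 more author) |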
Copyright, Publisher and Additional Information: | © 2018 The Author(s). For reuse permissions, please contact the Author(s). |
Dates: | |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Funding Information: | Funder: EUROPEAN COMMISSION - HORIZON 2020; Grant number: SUMMA - 688139 |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 12 Jun 2018 14:18 |
Last Modified: | 19 Dec 2022 15:38 |
Published Version: | https://arxiv.org/abs/1803.05355 |
Status: | Published |
Refereed: | Yes |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:131937 |