Wang, J.K. orcid.org/0000-0003-0048-3893 and Gaizauskas, R. orcid.org/0000-0002-3356-5126 (2016) Cross-validating Image Description Datasets and Evaluation Metrics. In: Calzolari, N., Choukri, K., Declerck, T., Goggi, S., Grobelnik, M., Maegaard, B., Mariani, J., Mazo, H., Moreno, A., Odijk, J. and Piperidis, S., (eds.) Proceedings of the 10th Language Resources and Evaluation Conference. 10th Language Resources and Evaluation Conference (LREC 2016), 23-28 May 2016, Portorož, Slovenia. European Language Resources Association, pp. 3059-3066. ISBN 978-2-9517408-9-1
Abstract
The task of automatically generating sentential descriptions of image content has become increasingly popular in recent years, resulting in the development of large-scale image description datasets and the proposal of various metrics for evaluating image description generation systems. However, little work has been done to analyse and understand both the datasets and the metrics. In this paper, we propose using a leave-one-out cross validation (LOOCV) process as a means to analyse multiply annotated, human-authored image description datasets and the various evaluation metrics, i.e. evaluating one image description against other human-authored descriptions of the same image. Such an evaluation process affords various insights into the image description datasets and evaluation metrics, such as the variations of image descriptions within and across datasets and also what the metrics capture. We compute and analyse (i) human upper-bound performance; (ii) ranked correlation between metric pairs across datasets; and (iii) lower-bound performance, obtained by comparing a set of descriptions describing one image to another sentence not describing that image. Interesting observations are made about the evaluation metrics and image description datasets, and we conclude that such cross-validation methods are extremely useful for assessing and gaining insights into image description datasets and evaluation metrics for image descriptions.
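The LOOCV procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy metric `unigram_f1`, the function names `loocv_upper_bound` and `mismatched_lower_bound`, and the two-image `toy_dataset` are all assumptions introduced here for illustration, and a real study would substitute an established image description metric. The first function holds out each human description and scores it against the remaining descriptions of the same image (the human upper bound); the second scores a sentence taken from a different image against an image's descriptions (the lower bound).

```python
from collections import Counter
from statistics import mean


def unigram_f1(candidate, references):
    """Toy stand-in metric (assumption): best unigram-overlap F1 against any reference."""
    cand = Counter(candidate.lower().split())
    best = 0.0
    for ref in references:
        ref_counts = Counter(ref.lower().split())
        overlap = sum((cand & ref_counts).values())
        if overlap == 0:
            continue
        precision = overlap / sum(cand.values())
        recall = overlap / sum(ref_counts.values())
        best = max(best, 2 * precision * recall / (precision + recall))
    return best


def loocv_upper_bound(dataset, metric=unigram_f1):
    """Hold out each description in turn; score it against the others for the same image."""
    scores = []
    for descriptions in dataset.values():
        for i, held_out in enumerate(descriptions):
            others = descriptions[:i] + descriptions[i + 1:]
            scores.append(metric(held_out, others))
    return mean(scores)


def mismatched_lower_bound(dataset, metric=unigram_f1):
    """Score a sentence from a different image against each image's descriptions."""
    image_ids = list(dataset)
    scores = []
    for idx, image_id in enumerate(image_ids):
        # Pick the next image (wrapping around) as the source of the unrelated sentence.
        other_id = image_ids[(idx + 1) % len(image_ids)]
        scores.append(metric(dataset[other_id][0], dataset[image_id]))
    return mean(scores)


# Hypothetical two-image dataset with three human descriptions each.
toy_dataset = {
    "img1": ["a dog runs on grass", "a dog is running outside", "brown dog on the grass"],
    "img2": ["two people ride bicycles", "people cycling down a street", "two cyclists on a road"],
}
upper = loocv_upper_bound(toy_dataset)
lower = mismatched_lower_bound(toy_dataset)
```

On this toy data the mismatched score is non-zero only because of shared function words such as "a" and "on", which hints at why a lower bound is informative about what a metric actually captures.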
Metadata
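Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Wang, J.K. (orcid.org/0000-0003-0048-3893); Gaizauskas, R. (orcid.org/0000-0002-3356-5126) |
Editors: | Calzolari, N.; Choukri, K.; Declerck, T.; Goggi, S.; Grobelnik, M.; Maegaard, B.; Mariani, J.; Mazo, H.; Moreno, A.; Odijk, J.; Piperidis, S. |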
Copyright, Publisher and Additional Information: | © 2016 the European Language Resources Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. You may not use the material for commercial purposes. |
Dates: | |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Funding Information: | Funder: ENGINEERING AND PHYSICAL SCIENCE RESEARCH COUNCIL (EPSRC); Grant number: EP/K019082/1 |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 19 May 2016 14:42 |
Last Modified: | 19 Dec 2022 13:33 |
Published Version: | http://www.lrec-conf.org/proceedings/lrec2016/summ... |
Status: | Published |
Publisher: | European Language Resources Association |
Refereed: | Yes |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:99025 |