Al-Maskari, A., Sanderson, M., Clough, P. and Airio, E. (2008) The Good and the Bad System: Does the Test Collection Predict Users’ Effectiveness? In: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval. SIGIR '08, July 20-24, 2008, Singapore, Singapore. ACM, New York, USA, pp. 59-66. ISBN 978-1-60558-164-4
Test collections are extensively used in the evaluation of information retrieval systems. Crucial to their use is the degree to which results from them predict user effectiveness. Early studies did not substantiate a relationship between system and user effectiveness; more recently, however, correlations have begun to emerge. The results of this paper strengthen and extend those findings. We introduce a novel methodology for investigating the relationship, which shows great success in establishing a significant correlation between system and user effectiveness. It is shown that users behave differently and discern differences between pairs of systems that have a very small absolute difference in test collection effectiveness. Our results strengthen the use of test collections in IR evaluation, confirming that users' effectiveness can be predicted successfully.
|Keywords:|user study, effectiveness measures, test collection|
|Institution:|The University of Sheffield|
|Academic Units:|The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield)|
|Depositing User:|Repository Officer|
|Date Deposited:|19 Nov 2008 10:40|
|Last Modified:|19 Nov 2008 10:40|