
The impact on retrieval effectiveness of skewed frequency distributions

Sanderson, M. and van Rijsbergen, C.J. (1999) The impact on retrieval effectiveness of skewed frequency distributions. ACM Transactions on Information Systems (TOIS), 17 (4). pp. 440-465. ISSN 1046-8188


We present an analysis of word senses that provides a fresh insight into the impact of word ambiguity on retrieval effectiveness with potential broader implications for other processes of information retrieval. Using a methodology of forming artificially ambiguous words, known as pseudo-words, and through reference to other researchers’ work, the analysis illustrates that the distribution of the frequency of occurrence of the senses of a word plays a strong role in ambiguity’s impact on effectiveness. Further investigation shows that this analysis may also be applicable to other processes of retrieval, such as Cross Language Information Retrieval, query expansion, retrieval of OCR’ed texts, and stemming. The analysis appears to provide a means of explaining, at least in part, reasons for the processes’ impact (or lack of it) on effectiveness.
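
The pseudo-word idea in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact procedure, so everything here is an assumption made for illustration: the function names, the choice to group consecutive vocabulary words, the "/" joining convention, and the toy token counts are all hypothetical. The sketch conflates groups of real words into single artificial terms and then measures how skewed each pseudo-word's "sense" distribution is, taken here as the share of occurrences contributed by its most frequent component word.

```python
from collections import Counter


def make_pseudo_words(vocabulary, size=2):
    """Map each word to a pseudo-word built from a group of `size` words.

    Grouping consecutive vocabulary entries is an illustrative assumption,
    not the procedure used in the paper.
    """
    mapping = {}
    for i in range(0, len(vocabulary) - size + 1, size):
        group = vocabulary[i:i + size]
        pseudo = "/".join(group)
        for word in group:
            mapping[word] = pseudo
    return mapping


def apply_pseudo_words(tokens, mapping):
    """Replace every mapped token with its pseudo-word; leave others as-is."""
    return [mapping.get(token, token) for token in tokens]


def sense_skew(tokens, mapping):
    """Return, per pseudo-word, the fraction of its occurrences that come
    from its most frequent component word (1.0 means fully skewed)."""
    counts = Counter(token for token in tokens if token in mapping)
    by_pseudo = {}
    for word, count in counts.items():
        by_pseudo.setdefault(mapping[word], []).append(count)
    return {pseudo: max(cs) / sum(cs) for pseudo, cs in by_pseudo.items()}


# Toy collection (hypothetical counts): one skewed pair, one balanced pair.
tokens = ["bank"] * 9 + ["river"] * 1 + ["spring"] * 3 + ["kettle"] * 3
mapping = make_pseudo_words(["bank", "river", "spring", "kettle"], size=2)
skew = sense_skew(tokens, mapping)
```

In this toy data the pseudo-word "bank/river" is dominated by one component, while "spring/kettle" is evenly split. Under the abstract's argument, the dominated case is the kind in which ambiguity would be expected to matter less to retrieval, since most occurrences of the ambiguous term already carry the same underlying meaning.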

Item Type: Article
Copyright, Publisher and Additional Information: © 2000 ACM. This is an author produced version of a paper published in ACM Transactions on Information Systems (TOIS). Uploaded in accordance with the publisher's self-archiving policy.
Keywords: word sense ambiguity, word sense disambiguation, pseudo-words
Institution: The University of Sheffield
Academic Units: The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield)
Depositing User: Repository Officer
Date Deposited: 03 Sep 2008 17:39
Last Modified: 13 Sep 2014 07:41
Published Version: http://dx.doi.org/10.1145/326440.326447
Status: Published
Publisher: ACM
Refereed: Yes
Identification Number: 10.1145/326440.326447
URI: http://eprints.whiterose.ac.uk/id/eprint/4593
