Sanderson, M. and van Rijsbergen, C.J. (1999) The impact on retrieval effectiveness of skewed frequency distributions. ACM Transactions on Information Systems (TOIS), 17 (4). pp. 440-465. ISSN 1046-8188
Abstract
We present an analysis of word senses that provides a fresh insight into the impact of word ambiguity on retrieval effectiveness with potential broader implications for other processes of information retrieval. Using a methodology of forming artificially ambiguous words, known as pseudo-words, and through reference to other researchers’ work, the analysis illustrates that the distribution of the frequency of occurrence of the senses of a word plays a strong role in ambiguity’s impact on effectiveness. Further investigation shows that this analysis may also be applicable to other processes of retrieval, such as Cross Language Information Retrieval, query expansion, retrieval of OCR’ed texts, and stemming. The analysis appears to provide a means of explaining, at least in part, reasons for the processes’ impact (or lack of it) on effectiveness.
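The pseudo-word idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the "/" joining convention, and the toy token list are all assumptions made for illustration. The sketch conflates several distinct words into one artificial token and then measures how skewed the frequencies of the original words (the pseudo-senses) are.

```python
from collections import Counter

def make_pseudo_word(tokens, components):
    """Replace every occurrence of a component word with one joined token.

    The "/" separator is an illustrative convention, not taken from the paper.
    """
    pseudo = "/".join(components)
    members = set(components)
    return [pseudo if t in members else t for t in tokens]

def sense_distribution(tokens, components):
    """Relative frequency of each component word, i.e. each pseudo-sense."""
    members = set(components)
    counts = Counter(t for t in tokens if t in members)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {w: counts[w] / total for w in components}

# Toy collection (hypothetical data): one component dominates the other.
tokens = ["banana"] * 9 + ["kalashnikov"] + ["the"] * 5
components = ["banana", "kalashnikov"]

ambiguous = make_pseudo_word(tokens, components)
skew = sense_distribution(tokens, components)
print(ambiguous[:2], skew)
```

In this toy case one pseudo-sense accounts for nine of ten occurrences, which is the kind of skewed distribution the abstract links to ambiguity's limited impact on effectiveness.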
Metadata
Item Type: | Article |
---|---|
Authors/Creators: | Sanderson, M. and van Rijsbergen, C.J. |
Copyright, Publisher and Additional Information: | © 2000 ACM. This is an author produced version of a paper published in ACM Transactions on Information Systems (TOIS). Uploaded in accordance with the publisher's self-archiving policy. |
Keywords: | word sense ambiguity, word sense disambiguation, pseudo-words |
Dates: | |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield) |
Depositing User: | Repository Officer |
Date Deposited: | 03 Sep 2008 17:39 |
Last Modified: | 13 Sep 2014 07:41 |
Published Version: | http://dx.doi.org/10.1145/326440.326447 |
Status: | Published |
Publisher: | ACM |
Refereed: | Yes |
Identification Number: | 10.1145/326440.326447 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:4593 |