
Retrieving descriptive phrases from large amounts of free text

Joho, H. and Sanderson, M. (2000) Retrieving descriptive phrases from large amounts of free text. In: Proceedings of the Ninth International Conference on Information and Knowledge Management. Ninth International Conference on Information and Knowledge Management (CIKM), November 6-11, 2000, McLean, USA. ACM, New York, USA, pp. 180-186. ISBN 1-58113-320-0




This paper presents a system that retrieves descriptive phrases of proper nouns from free text. Sentences containing the specified noun are ranked using a technique based on pattern matching, word counting, and sentence location. No domain-specific knowledge is used. Experiments show that the system ranks highly those sentences that contain phrases describing or defining the query noun. In contrast to existing methods, this system does not use parsing techniques but still achieves high levels of accuracy. From the results of a large-scale experiment, it is speculated that the success of this simpler method is due to the large quantities of free text being searched. Parallels are drawn between this work and recent findings in the very large corpus track of TREC.
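
As a rough illustration of the kind of ranking the abstract describes, the following minimal Python sketch scores each sentence that mentions a query noun by combining a pattern match, a word count, and the sentence's location in its document. It is not the authors' implementation: the function names (`pattern_score`, `rank_sentences`), the three regular expressions, the weights, and the scoring formulas are all assumptions made for this example, since the abstract does not specify them.

```python
import re
from collections import Counter

# Hypothetical weights; the abstract does not state the actual values.
W_PATTERN, W_WORDS, W_POSITION = 1.0, 0.5, 0.25


def pattern_score(sentence, noun):
    """Return 1.0 if the sentence matches one of a few illustrative patterns."""
    n = re.escape(noun)
    patterns = [
        rf"\b{n}\s*,\s*(?:a|an|the)\b",           # "X, the ..." (apposition)
        rf"\b{n}\s+(?:is|was)\s+(?:a|an|the)\b",  # "X is a ..."
        rf"\bsuch as\s+{n}\b",                    # "... such as X"
    ]
    matched = any(re.search(p, sentence, re.IGNORECASE) for p in patterns)
    return 1.0 if matched else 0.0


def rank_sentences(docs, noun):
    """Rank sentences mentioning `noun`; docs is a list of lists of sentences."""
    # Keep each matching sentence with its position inside its document.
    hits = [(s, i) for doc in docs for i, s in enumerate(doc)
            if noun.lower() in s.lower()]
    # Word counting: in how many matching sentences does each word occur?
    counts = Counter(w for s, _ in hits
                     for w in set(re.findall(r"[a-z]+", s.lower())))
    scored = []
    for s, pos in hits:
        words = set(re.findall(r"[a-z]+", s.lower()))
        # Average share of matching sentences that contain each word (0..1).
        word_score = sum(counts[w] for w in words) / (max(len(words), 1) * len(hits))
        # Sentence location: earlier sentences in a document score higher.
        position_score = 1.0 / (1 + pos)
        total = (W_PATTERN * pattern_score(s, noun)
                 + W_WORDS * word_score
                 + W_POSITION * position_score)
        scored.append((total, s))
    return [s for _, s in sorted(scored, key=lambda t: -t[0])]


docs = [
    ["Tony Blair, the British prime minister, spoke today.",
     "Reporters asked Tony Blair about taxes."],
    ["The crowd cheered.", "Later, Tony Blair left."],
]
ranked = rank_sentences(docs, "Tony Blair")
```

With these assumed weights, a sentence that matches a descriptive pattern always outranks one that does not, and the word-count and location scores only order sentences within each group. No parsing is involved, which is the property the abstract emphasises.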

Item Type: Proceedings Paper
Copyright, Publisher and Additional Information: Uploaded in accordance with the publisher's self-archiving policy.
Keywords: information retrieval, descriptive phrase, large corpora
Institution: The University of Sheffield
Academic Units: The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield)
Depositing User: Repository Officer
Date Deposited: 28 Nov 2008 13:31
Last Modified: 08 Feb 2013 16:57
Published Version: http://dx.doi.org/10.1145/354756.354817
Status: Published
Publisher: ACM
Identification Number: 10.1145/354756.354817
URI: http://eprints.whiterose.ac.uk/id/eprint/4549
