Deriving concept hierarchies from text

Sanderson, M. and Croft, B. (1999) Deriving concept hierarchies from text. In: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval. Annual ACM Conference on Research and Development in Information Retrieval, August 15 - 19, 1999, Berkeley, California. ACM, New York, pp. 206-213. ISBN 1-58113-096-1


This paper presents a means of automatically deriving a hierarchical organization of concepts from a set of documents without use of training data or standard clustering techniques. Instead, salient words and phrases extracted from the documents are organized hierarchically using a type of co-occurrence known as subsumption. The resulting structure is displayed as a series of hierarchical menus. When generated from a set of retrieved documents, a user browsing the menus is provided with a detailed overview of their content in a manner distinct from existing overview and summarization techniques. The methods used to build the structure are simple, but appear to be effective: a small-scale user study reveals that the generated hierarchy possesses properties expected of such a structure in that general terms are placed at the top levels leading to related and more specific terms below. The formation and presentation of the hierarchy are described along with the user study and some other informal evaluations.
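
As a rough illustration of the subsumption idea mentioned in the abstract, the sketch below treats a term x as subsuming a term y when nearly every document containing y also contains x, but not the reverse, that is, P(x|y) >= threshold and P(y|x) < 1. The abstract does not state the exact test, so the condition, the 0.8 default threshold, the function name `subsumption_pairs`, and the toy documents are all assumptions made for illustration rather than details taken from this record.

```python
from itertools import permutations


def subsumption_pairs(doc_terms, threshold=0.8):
    """Return (parent, child) pairs where parent is taken to subsume child.

    doc_terms: list of sets, one set of terms per document.
    threshold: assumed cut-off for P(parent | child); hypothetical default.
    """
    # Map each term to the set of document indices that contain it.
    postings = {}
    for doc_id, terms in enumerate(doc_terms):
        for term in terms:
            postings.setdefault(term, set()).add(doc_id)

    pairs = []
    for x, y in permutations(sorted(postings), 2):
        both = len(postings[x] & postings[y])
        p_x_given_y = both / len(postings[y])  # share of y's documents that contain x
        p_y_given_x = both / len(postings[x])  # share of x's documents that contain y
        if p_x_given_y >= threshold and p_y_given_x < 1.0:
            pairs.append((x, y))
    return pairs


# Toy collection (made up): "sport" appears everywhere, the other terms are narrower.
docs = [
    {"sport", "football"},
    {"sport", "tennis"},
    {"sport", "football", "goal"},
    {"sport"},
]
pairs = subsumption_pairs(docs)
for parent, child in pairs:
    print(f"{parent} -> {child}")
```

On this toy input the general term ends up as the parent of the narrower ones, which matches the property the abstract describes (general terms at the top, more specific terms below). A full hierarchy would also need term selection and pruning of redundant links, which this sketch omits.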

Item Type: Proceedings Paper
Copyright, Publisher and Additional Information: Copyright 1999 ACM. This is an author produced version of a paper published in "Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval". Uploaded in accordance with the publisher's self-archiving policy.
Keywords: Concept hierarchy, subsumption, term co-occurrence, multidocument summary
Institution: The University of Sheffield
Academic Units: The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield)
Depositing User: Repository Officer
Date Deposited: 08 Sep 2008 17:25
Last Modified: 15 Sep 2014 03:39
Published Version: http://dx.doi.org/10.1145/312624.312679
Status: Published
Publisher: ACM
Identification Number: 10.1145/312624.312679
URI: http://eprints.whiterose.ac.uk/id/eprint/4582
