Aker, A., Kurtic, E., Balamurali, A.R. et al. (4 more authors) (2016) A Graph-Based Approach to Topic Clustering for Online Comments to News. In: Ferro, N., Crestani, F., Moens, M.-F., Mothe, J., Silvestri, F., Di Nunzio, G.M., Hauff, C. and Silvello, G., (eds.) Advances in Information Retrieval. ECIR 2016: European Conference on Information Retrieval, 20-23 Mar 2016, Padua, Italy. Lecture Notes in Computer Science, 9626. Springer International Publishing, pp. 15-29. ISBN 978-3-319-30671-1
Abstract
This paper investigates graph-based approaches to labeled topic clustering of reader comments in online news. For graph-based clustering we propose a linear regression model of similarity between the graph nodes (comments) based on similarity features and weights trained using automatically derived training data. To label the clusters our graph-based approach makes use of DBPedia to abstract topics extracted from the clusters. We evaluate the clustering approach against gold standard data created by human annotators and compare its results against LDA – currently reported as the best method for the news comment clustering task. Evaluation of cluster labelling is set up as a retrieval task, where human annotators are asked to identify the best cluster given a cluster label. Our clustering approach significantly outperforms the LDA baseline and our evaluation of abstract cluster labels shows that graph-based approaches are a promising method of creating labeled clusters of news comments, although we still find cases where the automatically generated abstractive labels are insufficient to allow humans to correctly associate a label with its cluster.
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Aker, A., Kurtic, E., Balamurali, A.R. et al. (4 more authors) |
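Editors: | Ferro, N., Crestani, F., Moens, M.-F., Mothe, J., Silvestri, F., Di Nunzio, G.M., Hauff, C. and Silvello, G. |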
Copyright, Publisher and Additional Information: | © 2016 Springer. This is an author produced version of a paper subsequently published in Advances in Information Retrieval (Lecture Notes in Computer Science). Uploaded in accordance with the publisher's self-archiving policy. |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield); The University of Sheffield > Faculty of Medicine, Dentistry and Health (Sheffield) > Department of Human Communication Sciences (Sheffield); The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 16 Jun 2017 13:32 |
Last Modified: | 19 Dec 2022 13:36 |
Published Version: | http://dx.doi.org/10.1007/978-3-319-30671-1_2 |
Status: | Published |
Publisher: | Springer International Publishing |
Series Name: | Lecture Notes in Computer Science |
Refereed: | Yes |
Identification Number: | 10.1007/978-3-319-30671-1_2 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:117791 |