Clustering files of chemical structures using the Szekely-Rizzo generalization of Ward's method

Varin, T., Bureau, R., Mueller, C. and Willett, P. (2009) Clustering files of chemical structures using the Szekely-Rizzo generalization of Ward's method. Journal of Molecular Graphics and Modelling, 28 (2). pp. 187-195. ISSN 1093-3263

Abstract

Ward's method is extensively used for clustering chemical structures represented by 2D fingerprints. This paper compares Ward clusterings of 14 datasets (containing between 278 and 4332 molecules) with those obtained using the Székely–Rizzo clustering method, a generalization of Ward's method. The clusters resulting from these two methods were evaluated by the extent to which the various classifications were able to group active molecules together, using a novel criterion of clustering effectiveness. Analysis of a total of 1400 classifications (Ward and Székely–Rizzo clustering methods, 14 different datasets, 5 different fingerprints and 10 different distance coefficients) demonstrated the general superiority of the Székely–Rizzo method. The distance coefficient first described by Soergel performed extremely well in these experiments, and this was also the case when it was used in simulated virtual screening experiments.
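For readers unfamiliar with the two quantities named in the abstract, the following is a minimal illustrative sketch, not the authors' implementation. It assumes the standard definition of the Soergel distance (sum of absolute differences divided by the sum of element-wise maxima) and the Székely–Rizzo joint between-within "e-distance" between two clusters with exponent alpha. The function names `soergel`, `mean_dist` and `e_distance`, and the choice to plug the Soergel distance into the e-distance, are assumptions made here for illustration only.

```python
from itertools import product


def soergel(x, y):
    """Soergel distance: sum|x_i - y_i| / sum max(x_i, y_i).

    For binary fingerprints this equals 1 - Tanimoto similarity.
    """
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(max(a, b) for a, b in zip(x, y))
    return num / den if den else 0.0


def mean_dist(A, B, dist, alpha):
    """Mean of dist(a, b)**alpha over all pairs (a, b) in A x B."""
    total = sum(dist(a, b) ** alpha for a, b in product(A, B))
    return total / (len(A) * len(B))


def e_distance(A, B, dist=soergel, alpha=1.0):
    """Joint between-within distance between clusters A and B.

    With Euclidean distance and alpha = 2 this is proportional to
    Ward's minimum variance criterion; other alpha values give the
    generalization. Using Soergel as `dist` is an assumption here.
    """
    n1, n2 = len(A), len(B)
    between = 2.0 * mean_dist(A, B, dist, alpha)
    within = mean_dist(A, A, dist, alpha) + mean_dist(B, B, dist, alpha)
    return (n1 * n2 / (n1 + n2)) * (between - within)


# Hypothetical 4-bit fingerprints, for illustration only.
fp1 = [1, 1, 0, 0]
fp2 = [1, 0, 1, 0]
d = soergel(fp1, fp2)
```

In an agglomerative procedure, the pair of clusters with the smallest e-distance would be merged at each step; for singleton clusters and alpha = 1 the e-distance reduces to the underlying pairwise distance.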

Item Type: Article
Copyright, Publisher and Additional Information: © 2009 Elsevier. This is an author produced version of a paper subsequently published in Journal of Molecular Graphics and Modelling. Uploaded in accordance with the publisher's self-archiving policy.
Keywords: Clustering method; Distance coefficient; Energy clustering; Fingerprint; Fragment substructure; Joint between-within distance; Minimum variance clustering method; Soergel coefficient; Szekely–Rizzo clustering method; Ward's clustering method
Institution: The University of Sheffield
Academic Units: The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield)
Depositing User: Miss Anthea Tucker
Date Deposited: 27 Jan 2010 16:49
Last Modified: 08 Feb 2013 16:59
Published Version: http://dx.doi.org/10.1016/j.jmgm.2009.06.006
Status: Published
Publisher: Elsevier
Refereed: Yes
Identification Number: 10.1016/j.jmgm.2009.06.006
URI: http://eprints.whiterose.ac.uk/id/eprint/10328