Analysis and display of the size dependence of chemical similarity coefficients

Holliday, J.D., Salim, N., Whittle, M. and Willett, P. (2003) Analysis and display of the size dependence of chemical similarity coefficients. Journal of Chemical Information and Computer Sciences, 43 (3). pp. 819-828. ISSN 0095-2338

Full text not available from this repository.

Abstract

We discuss the size bias inherent in several chemical similarity coefficients when used for the similarity searching or diversity selection of compound collections. Limits to the upper bounds of 14 standard similarity coefficients are investigated, and the results are used to identify some exceptional characteristics of a few of the coefficients. An additional numerical contribution to the known size bias in the Tanimoto coefficient is identified. Graphical plots with respect to relative bit density are introduced to further assess the coefficients. Our methods reveal the asymmetries inherent in most similarity coefficients that lead to bias in selection, most notably with the Forbes and Russell-Rao coefficients. Conversely, when applied to the recently introduced Modified Tanimoto coefficient, our methods provide support for the view that it is less biased toward molecular size than most. In this work, we focus our discussion on fragment-based bit strings, but we demonstrate how our approach can be generalized to continuous representations.
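As an illustration of the size dependence described in the abstract, the following minimal Python sketch computes three of the named coefficients on toy bit strings. It uses the commonly quoted textbook definitions (Tanimoto c/(a+b-c), Russell-Rao c/n, Forbes cn/(ab)), which are assumed here and may differ in notation from the paper. The function names and the example fingerprints are hypothetical, chosen only for this sketch.

```python
# Illustrative sketch only: textbook definitions of three similarity
# coefficients on equal-length bit strings (lists of 0/1).
# a = bits set in x, b = bits set in y, c = bits set in both, n = length.
# Assumes both fingerprints have at least one bit set.

def counts(x, y):
    """Return (a, b, c) for two equal-length bit strings."""
    a = sum(x)
    b = sum(y)
    c = sum(1 for i, j in zip(x, y) if i and j)
    return a, b, c

def tanimoto(x, y):
    a, b, c = counts(x, y)
    return c / (a + b - c)

def russell_rao(x, y):
    _, _, c = counts(x, y)
    return c / len(x)

def forbes(x, y):
    a, b, c = counts(x, y)
    return c * len(x) / (a * b)

def tanimoto_upper_bound(a, b):
    """Largest Tanimoto value possible given only the two bit counts.

    The common bits cannot exceed min(a, b), so the coefficient
    cannot exceed min(a, b) / max(a, b).
    """
    return min(a, b) / max(a, b)

# Hypothetical fingerprints of length 10: one sparse, one dense.
small = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]   # 2 bits set
large = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]   # 8 bits set

# Even though every bit of `small` is contained in `large`,
# the Tanimoto value is capped by the ratio of the bit counts.
print(tanimoto(small, large), tanimoto_upper_bound(2, 8))

# Self-similarity shows the opposite biases of the other two coefficients:
# Russell-Rao rewards dense bit strings, Forbes rewards sparse ones.
print(russell_rao(small, small), russell_rao(large, large))
print(forbes(small, small), forbes(large, large))
```

In this toy case the Tanimoto value for the sparse/dense pair equals its upper bound of 2/8, while the self-similarity values differ between the sparse and dense bit strings for both Russell-Rao and Forbes, in opposite directions.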

Item Type: Article
Institution: The University of Sheffield
Academic Units: The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield)
Depositing User: Information Studies
Date Deposited: 26 Aug 2009 09:50
Last Modified: 26 Aug 2009 09:50
Published Version: http://dx.doi.org/10.1021/ci034001x
Status: Published
Publisher: American Chemical Society
Identification Number: 10.1021/ci034001x
URI: http://eprints.whiterose.ac.uk/id/eprint/9229
