This is the latest version of this eprint.
Mardia, KV, Barber, S, Burdett, PM et al. (2 more authors) (2022) Mixture Models for Spherical Data with Applications to Protein Bioinformatics. In: SenGupta, A and Arnold, B, (eds.) Directional Statistics for Innovative Applications: A Bicentennial Tribute to Florence Nightingale. Forum for Interdisciplinary Mathematics. Springer Singapore, pp. 15-32. ISBN 978-9811910432
Abstract
Finite mixture models are fitted to spherical data. Kent distributions are used for the components of the mixture because they allow considerable flexibility. Previous work on such mixtures has used an approximate maximum likelihood estimator for the parameters of a single component. However, the approximation causes problems when using the EM algorithm to estimate the parameters in a mixture model. Hence, the exact maximum likelihood estimator is used here for the individual components. This paper is motivated by a challenging prize problem in structural bioinformatics of how proteins fold. It is known that hydrogen bonds play a key role in the folding of a protein. We explore this hydrogen bond geometry using a data set describing bonds between two amino acids in proteins. An appropriate coordinate system to represent the hydrogen bond geometry is proposed, with each bond represented as a point on a sphere. We fit mixtures of Kent distributions to different subsets of the hydrogen bond data to gain insight into how the secondary structure elements bond together, since the distribution of hydrogen bonds depends on which secondary structure elements are involved.
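For orientation, the model named in the abstract can be written out in standard notation. This is only a sketch: the symbols follow the usual parameterisation of the Kent (five-parameter Fisher-Bingham) distribution and are assumptions of this note, not notation taken from the chapter itself. The labels $K$, $\pi_k$, $\theta_k$ and $w_{ik}$ are likewise chosen here for illustration.

```latex
% Kent density on the unit sphere S^2 (standard form; assumed notation)
f(x;\kappa,\beta,\Gamma)
  = \frac{1}{c(\kappa,\beta)}
    \exp\!\left\{ \kappa\,\gamma_1^{\top}x
      + \beta\left[(\gamma_2^{\top}x)^2 - (\gamma_3^{\top}x)^2\right] \right\},
  \qquad x \in S^2,
% \Gamma = (\gamma_1,\gamma_2,\gamma_3) is a 3x3 orthogonal matrix
% (mean direction, major axis, minor axis); \kappa \ge 0 is the
% concentration and 0 \le 2\beta < \kappa gives a unimodal density.

% Finite mixture with K Kent components, \theta_k = (\kappa_k,\beta_k,\Gamma_k)
g(x) = \sum_{k=1}^{K} \pi_k\, f(x;\theta_k),
  \qquad \pi_k \ge 0,\quad \sum_{k=1}^{K}\pi_k = 1.

% E-step: responsibility of component k for observation x_i
w_{ik} = \frac{\pi_k\, f(x_i;\theta_k)}
              {\sum_{j=1}^{K} \pi_j\, f(x_i;\theta_j)}.

% M-step: update the weights, then maximise each weighted log-likelihood
\pi_k = \frac{1}{n}\sum_{i=1}^{n} w_{ik},
  \qquad
\hat\theta_k = \arg\max_{\theta}\; \sum_{i=1}^{n} w_{ik}\,\log f(x_i;\theta).
```

In this form, the last maximisation is the step at which a single-component estimator is needed. The abstract states that the exact maximum likelihood estimator is used there instead of the approximate one.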
Metadata
Item Type: | Book Section |
---|---|
Authors/Creators: | Mardia, KV; Barber, S; Burdett, PM; et al. (2 more authors) |
Editors: | SenGupta, A; Arnold, B |
Copyright, Publisher and Additional Information: | © 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. This version of the article has been accepted for publication, after peer review (when applicable) and is subject to Springer Nature’s AM terms of use (https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms), but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at https://doi.org/10.1007/978-981-19-1044-9_2. |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Mathematics (Leeds) > Statistics (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 08 Apr 2022 14:01 |
Last Modified: | 16 Jun 2024 00:13 |
Status: | Published |
Publisher: | Springer Singapore |
Series Name: | Forum for Interdisciplinary Mathematics |
Identification Number: | 10.1007/978-981-19-1044-9_2 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:185512 |
Available Versions of this Item
- Mixture models for spherical data with applications to protein bioinformatics. (deposited 08 Apr 2022 10:46)
- Mixture Models for Spherical Data with Applications to Protein Bioinformatics. (deposited 08 Apr 2022 14:01) [Currently Displayed]