Rønning, O., Ley, C., Mardia, K.V. orcid.org/0000-0003-0090-6235 et al. (1 more author) (2021) Time-efficient Bayesian Inference for a (Skewed) Von Mises Distribution on the Torus in a Deep Probabilistic Programming Language. In: 2021 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI). 2021 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), 23-25 Sep 2021, Karlsruhe, Germany. IEEE ISBN 978-1-6654-4522-1
Abstract
Probabilistic programming languages (PPLs) are at the interface between statistics and the theory of programming languages. PPLs formulate statistical models as stochastic programs that enable automatic inference algorithms and optimization. Pyro [1] and its sibling NumPyro [2] are PPLs built on top of the deep learning frameworks PyTorch [3] and JAX [4], respectively. Both PPLs provide simple, highly similar interfaces for inference using efficient implementations of Hamiltonian Monte Carlo (HMC), the No-U-Turn Sampler (NUTS), and Stochastic Variational Inference (SVI). They automatically generate variational distributions from a model, automatically enumerate discrete variables, and support formulating deep probabilistic models such as variational autoencoders and deep Markov models. The Sine von Mises distribution and its skewed variant are toroidal distributions relevant to protein bioinformatics. They provide a natural way to model the dihedral angles of protein structures, which is important in protein structure prediction, simulation, and analysis. We present efficient implementations of the Sine von Mises distribution and its skewed variant in Pyro and NumPyro, and devise a simulation method that increases efficiency by several orders of magnitude when using parallel hardware (i.e., modern CPUs, GPUs, and TPUs). We demonstrate the use of the skewed Sine von Mises distribution by modeling dihedral angles of proteins with a Bayesian mixture model inferred using NUTS, exploiting NumPyro's facilities for automatic enumeration [5].
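For orientation, the densities named in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, assuming the commonly used form of the bivariate Sine von Mises model and of sine-skewing from the directional statistics literature; the abstract does not state the paper's exact parameterization, so the function names and the symbols `mu`, `nu`, `kappa1`, `kappa2`, `lam`, `skew1`, and `skew2` are hypothetical. Normalizing constants are omitted, and this is not the Pyro or NumPyro implementation the paper presents.

```python
import math


def sine_vm_log_unnorm(phi, psi, mu, nu, kappa1, kappa2, lam):
    """Unnormalized log-density of a bivariate Sine von Mises model.

    phi, psi       : angles on the torus (radians)
    mu, nu         : location parameters (assumed names)
    kappa1, kappa2 : concentrations (assumed names)
    lam            : correlation-like parameter (assumed name)
    """
    return (
        kappa1 * math.cos(phi - mu)
        + kappa2 * math.cos(psi - nu)
        + lam * math.sin(phi - mu) * math.sin(psi - nu)
    )


def sine_skewed_log_unnorm(phi, psi, mu, nu, kappa1, kappa2, lam, skew1, skew2):
    """Unnormalized log-density after sine-skewing the base model.

    The base density is multiplied by
        1 + skew1 * sin(phi - mu) + skew2 * sin(psi - nu),
    which stays non-negative when |skew1| + |skew2| <= 1.
    """
    if abs(skew1) + abs(skew2) > 1.0:
        raise ValueError("skewness parameters must satisfy |skew1| + |skew2| <= 1")
    base = sine_vm_log_unnorm(phi, psi, mu, nu, kappa1, kappa2, lam)
    factor = 1.0 + skew1 * math.sin(phi - mu) + skew2 * math.sin(psi - nu)
    if factor <= 0.0:
        # The skewing factor can reach zero on the boundary of the constraint.
        return -math.inf
    return base + math.log(factor)
```

In this form, the skewing factor equals 1 at the location `(mu, nu)`, so the skewed and symmetric log-densities agree there; away from it, the factor shifts mass to one side along each angle.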
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | |
Keywords: | Proteins, Computer languages, Analytical models, Monte Carlo methods, Biological system modeling, Mixture models, Predictive models |
Dates: | |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Mathematics (Leeds) > Statistics (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 05 Jun 2024 12:16 |
Last Modified: | 05 Jun 2024 12:16 |
Published Version: | http://dx.doi.org/10.1109/mfi52462.2021.9591184 |
Status: | Published |
Publisher: | IEEE |
Identification Number: | 10.1109/mfi52462.2021.9591184 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:213133 |