Bonazzola, R. orcid.org/0000-0001-8811-2581, Ferrante, E., Ravikumar, N. orcid.org/0000-0003-0134-107X et al. (5 more authors) (2024) Unsupervised ensemble-based phenotyping enhances discoverability of genes related to left-ventricular morphology. Nature Machine Intelligence, 6. pp. 291-306. ISSN 2522-5839
Abstract
Recent genome-wide association studies have successfully identified associations between genetic variants and simple cardiac morphological parameters derived from cardiac magnetic resonance images. However, the emergence of large databases, including genetic data linked to cardiac magnetic resonance, facilitates the investigation of more nuanced patterns of cardiac shape variability than those studied so far. Here we propose a framework for gene discovery coined unsupervised phenotype ensembles. The unsupervised phenotype ensemble builds a redundant yet highly expressive representation by pooling a set of phenotypes learnt in an unsupervised manner, using deep learning models trained with different hyperparameters. These phenotypes are then analysed via genome-wide association studies, retaining only highly confident and stable associations across the ensemble. We applied our approach to the UK Biobank database to extract geometric features of the left ventricle from image-derived three-dimensional meshes. We demonstrate that our approach greatly improves the discoverability of genes that influence left ventricle shape, identifying 49 loci with study-wide significance and 25 with suggestive significance. We argue that our approach would enable more extensive discovery of gene associations with image-derived phenotypes for other organs or image modalities.
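The filtering step described in the abstract (keeping only associations that are both highly confident and stable across the ensemble) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `stable_loci`, the Bonferroni-style study-wide threshold, the `min_fraction` stability criterion, and the example p-values are all assumptions made for illustration. Only the conventional genome-wide threshold of 5e-8 is standard practice.

```python
from typing import Dict, List

# Conventional genome-wide significance threshold for a single GWAS.
GENOME_WIDE = 5e-8


def stable_loci(
    pvals: Dict[str, List[float]],
    n_phenotypes: int,
    min_fraction: float = 0.5,
) -> List[str]:
    """Return loci that are both confident and stable across ensemble runs.

    pvals maps a locus name to its best p-value in each ensemble run.
    Both criteria below are illustrative assumptions, not the paper's rules.
    """
    # Assumed study-wide threshold: Bonferroni correction over phenotypes.
    study_wide = GENOME_WIDE / n_phenotypes
    kept = []
    for locus, run_pvals in pvals.items():
        # "Confident": at least one run passes the study-wide threshold.
        confident = min(run_pvals) < study_wide
        # "Stable": enough runs reach genome-wide significance.
        hits = sum(p < GENOME_WIDE for p in run_pvals)
        stable = hits / len(run_pvals) >= min_fraction
        if confident and stable:
            kept.append(locus)
    return sorted(kept)


# Hypothetical p-values for three loci across three ensemble runs.
example = {
    "locus_A": [1e-12, 3e-9, 2e-10],  # strong in every run
    "locus_B": [1e-12, 0.2, 0.5],     # strong in one run only
    "locus_C": [1e-6, 1e-7, 1e-6],    # never genome-wide significant
}
result = stable_loci(example, n_phenotypes=10)
```

In this sketch, a locus that is significant in a single run but absent from the others is discarded, which is the intuition behind requiring stability across the ensemble.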
Metadata
Item Type: | Article |
---|---|
Copyright, Publisher and Additional Information: | © 2024 Springer Nature Limited. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
Keywords: | 46 Information and Computing Sciences; 40 Engineering; Networking and Information Technology R&D (NITRD); Human Genome; Biomedical Imaging; Genetics; Machine Learning and Artificial Intelligence; Cardiovascular; Heart Disease; 2.1 Biological and endogenous factors; Generic health relevance |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds); The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) > Biomedical & Health; The University of Leeds > Faculty of Medicine and Health (Leeds) > School of Medicine (Leeds) > Leeds Institute of Cardiovascular and Metabolic Medicine (LICAMM) > Biomedical Imaging Science Dept (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 25 Jul 2024 13:05 |
Last Modified: | 25 Jul 2024 13:05 |
Published Version: | http://dx.doi.org/10.1038/s42256-024-00801-1 |
Status: | Published |
Publisher: | Springer Nature |
Identification Number: | 10.1038/s42256-024-00801-1 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:215244 |