Dhafari, T.B., Pate, A., Martin, G.P. et al. (4 more authors) (2026) Cluster separation outperforms other metrics in validating multimorbidity patterns: statistical simulation study. Journal of Clinical Epidemiology, 194. 112209. ISSN: 0895-4356
Abstract
Background and Objectives: Multimorbidity, defined as the presence of multiple long-term health conditions within an individual, remains a growing challenge in healthcare. Identifying frequently occurring multimorbidity clusters may help to develop targeted interventions and optimize care pathways. However, the validation of multimorbidity clusters derived from real-world data is complicated by the lack of a known “ground truth.” We conducted a statistical simulation study to evaluate the performance of three common validation approaches (cluster separation, clustering stability, and strength of association with health outcomes) in assessing the quality of multimorbidity clusters, where performance was measured by agreement with known ground truth clusters.

Methods: Simulated datasets with predefined clusters were generated across 25 scenarios, varying parameters such as disease prevalence, sample size, and noise levels. Latent class analysis was applied to derive clusters from the simulated data, and the derived clusters were compared to the predefined clusters using the adjusted Rand index (ARI). The ARI served as our gold standard quality assessment of derived clusters.

Results: Cluster separation, measured by the Calinski–Harabasz index, showed the strongest agreement with our gold standard in most scenarios (median correlation: 0.641, IQR: 0.505–0.728). Clustering stability, assessed using resampling, had mixed performance, with a median correlation of 0.421 (IQR: 0.127–0.526). The strength of association with health outcomes, assessed using Nagelkerke's R², consistently showed poor agreement with the ARI (median correlation: −0.424, IQR: −0.543 to −0.173).

Conclusion: Cluster separation seems to be the most reliable approach to validate multimorbidity clusters. Clustering stability can sometimes be used for validation but has limitations. Assessing the strength of association of multimorbidity clusters with health outcomes, though valuable for understanding clinical relevance, does not appear to validate cluster quality despite being commonly used in published literature.
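The two quantities at the centre of the abstract, the adjusted Rand index (agreement between derived and predefined clusters) and the Calinski–Harabasz index (cluster separation), can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names `adjusted_rand_index` and `calinski_harabasz`, the toy disease-indicator rows, and the use of squared Euclidean distance on binary indicators are all assumptions made for illustration only.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, derived_labels):
    """Chance-corrected agreement between two labelings of the same individuals."""
    n = len(true_labels)
    if n != len(derived_labels) or n < 2:
        raise ValueError("need two labelings of equal length >= 2")
    # Pairs placed together in both labelings (contingency-table cells).
    both = sum(comb(c, 2) for c in Counter(zip(true_labels, derived_labels)).values())
    # Pairs placed together within each labeling separately (table margins).
    in_true = sum(comb(c, 2) for c in Counter(true_labels).values())
    in_derived = sum(comb(c, 2) for c in Counter(derived_labels).values())
    expected = in_true * in_derived / comb(n, 2)
    max_index = (in_true + in_derived) / 2
    if max_index == expected:  # degenerate case: both labelings are trivial
        return 1.0
    return (both - expected) / (max_index - expected)


def calinski_harabasz(rows, labels):
    """Ratio of between-cluster to within-cluster dispersion (higher = more separated)."""
    n, k = len(rows), len(set(labels))
    if not 1 < k < n:
        raise ValueError("need more than one cluster and fewer clusters than rows")
    dims = range(len(rows[0]))
    overall = [sum(r[d] for r in rows) / n for d in dims]
    members = {}
    for row, label in zip(rows, labels):
        members.setdefault(label, []).append(row)
    between = within = 0.0
    for group in members.values():
        centroid = [sum(r[d] for r in group) / len(group) for d in dims]
        between += len(group) * sum((c - o) ** 2 for c, o in zip(centroid, overall))
        within += sum((x - c) ** 2 for r in group for x, c in zip(r, centroid))
    if within == 0:  # every row sits exactly on its centroid
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical toy data: six individuals, four binary condition indicators.
rows = [(1, 1, 0, 0), (1, 1, 0, 0), (1, 0, 0, 0),
        (0, 0, 1, 1), (0, 0, 1, 1), (0, 0, 0, 1)]
ground_truth = [0, 0, 0, 1, 1, 1]
derived = [1, 1, 1, 0, 0, 0]     # same partition, different label names
scrambled = [0, 1, 0, 1, 0, 1]   # partition unrelated to the ground truth

ari_good = adjusted_rand_index(ground_truth, derived)
ari_bad = adjusted_rand_index(ground_truth, scrambled)
ch_good = calinski_harabasz(rows, derived)
ch_bad = calinski_harabasz(rows, scrambled)
```

On this toy example both indices rank the partition that matches the predefined clusters above the scrambled one, which is the kind of agreement the study quantifies with correlations across its simulated scenarios.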
Metadata
| Item Type: | Article |
|---|---|
| Authors/Creators: | |
| Copyright, Publisher and Additional Information: | © 2026 The Authors. This is an open access article under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. |
| Keywords: | Multimorbidity, Analytical method, Cluster analysis, Validation, Simulation study, Latent class analysis |
| Dates: | |
| Institution: | The University of Leeds |
| Academic Units: | The University of Leeds > Faculty of Medicine and Health (Leeds) > School of Medicine (Leeds) |
| Funding Information: | Funder: British Heart Foundation Accounts Payable - Gloria Sankey; Grant number: PG/19/54/34511 |
| Date Deposited: | 01 May 2026 14:43 |
| Last Modified: | 01 May 2026 14:43 |
| Status: | Published |
| Publisher: | Elsevier |
| Identification Number: | 10.1016/j.jclinepi.2026.112209 |
| Related URLs: | |
| Sustainable Development Goals: | |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:240665 |
Download
Filename: PIIS0895435626000843.pdf
Licence: CC-BY 4.0

