Deo, Yash, Jia, Yan orcid.org/0000-0002-5446-6565, Lassila, Toni et al. (5 more authors) (2026) A Calibrated Memorization Index (MI) for Detecting Training Data Leakage in Generative MRI Models. [Preprint]
Abstract
Image generative models are known to duplicate images from the training data as part of their outputs, which can lead to privacy concerns when used for medical image generation. We propose a calibrated per-sample metric for detecting memorization and duplication of training data. Our metric uses image features extracted using an MRI foundation model, aggregates multi-layer whitened nearest-neighbor similarities, and maps them to a bounded Overfit/Novelty Index (ONI) and Memorization Index (MI) scores. Across three MRI datasets with controlled duplication percentages and typical image augmentations, our metric robustly detects duplication and provides more consistent metric values across datasets. At the sample level, our metric achieves near-perfect detection of duplicates.
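The abstract does not give the formulas behind ONI or MI, so the following is only a rough sketch of the general idea it describes: whiten per-layer features, take each generated sample's nearest-neighbor similarity to the training set, average across layers, and squash the result into a bounded score. Everything specific here is an assumption made for illustration, not the authors' method: the function names (`whiten`, `nn_cosine`, `bounded_duplication_score`), the use of cosine similarity, the held-out reference set used for calibration, the 99th-percentile threshold, and the `temperature` parameter. Random arrays stand in for the MRI foundation-model features.

```python
import numpy as np


def whiten(train, query, eps=1e-6):
    """Whiten both feature sets with statistics estimated on `train` (ZCA)."""
    mean = train.mean(axis=0)
    cov = np.cov(train - mean, rowvar=False) + eps * np.eye(train.shape[1])
    eigval, eigvec = np.linalg.eigh(cov)
    w = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T
    return (train - mean) @ w, (query - mean) @ w


def nn_cosine(train, query):
    """Cosine similarity of each query row to its nearest training row."""
    t = train / np.linalg.norm(train, axis=1, keepdims=True)
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    return (q @ t.T).max(axis=1)


def bounded_duplication_score(train_layers, gen_layers, ref_layers,
                              temperature=0.05):
    """Hypothetical per-sample score in [0, 1]; higher means closer to a
    training sample than held-out reference data usually is.

    Each argument is a list with one (n_samples, n_features) array per layer.
    """
    gen_sims, ref_sims = [], []
    for tr, ge, rf in zip(train_layers, gen_layers, ref_layers):
        tr_w, ge_w = whiten(tr, ge)
        _, rf_w = whiten(tr, rf)
        gen_sims.append(nn_cosine(tr_w, ge_w))
        ref_sims.append(nn_cosine(tr_w, rf_w))
    # Aggregate across layers by a plain mean (an assumption).
    gen_agg = np.mean(gen_sims, axis=0)
    ref_agg = np.mean(ref_sims, axis=0)
    # Calibrate against held-out data: assumed 99th-percentile threshold.
    threshold = np.quantile(ref_agg, 0.99)
    return 1.0 / (1.0 + np.exp(-(gen_agg - threshold) / temperature))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic "layers" of 16-dimensional features.
    train = [rng.normal(size=(200, 16)) for _ in range(2)]
    novel = [rng.normal(size=(5, 16)) for _ in range(2)]
    # First five generated samples are exact copies of training samples.
    gen = [np.vstack([t[:5], n]) for t, n in zip(train, novel)]
    ref = [rng.normal(size=(100, 16)) for _ in range(2)]
    print(bounded_duplication_score(train, gen, ref).round(3))
```

In this toy setup the exact copies reach a nearest-neighbor similarity of essentially 1 in every layer, so they score above the novel samples. Calibrating against a held-out reference set is one plausible way to make scores comparable across datasets, which is the property the abstract emphasizes, but the paper's actual calibration may differ.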
Metadata
| Item Type: | Preprint |
|---|---|
| Authors/Creators: | Deo, Yash; Jia, Yan (orcid.org/0000-0002-5446-6565); Lassila, Toni; et al. (5 more authors) |
| Copyright, Publisher and Additional Information: | Accepted in ISBI 2026 |
| Keywords: | cs.CV |
| Dates: | |
| Institution: | The University of York |
| Academic Units: | The University of York > Faculty of Sciences (York) > Computer Science (York) |
| Date Deposited: | 11 Mar 2026 17:00 |
| Last Modified: | 11 Mar 2026 20:45 |
| Published Version: | https://doi.org/10.48550/arXiv.2602.13066 |
| Status: | Published |
| Publisher: | arXiv |
| Identification Number: | 10.48550/arXiv.2602.13066 |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:238949 |

CORE (COnnecting REpositories)