Roa-Dabike, G. orcid.org/0000-0001-7839-8061, Cox, T.J. orcid.org/0000-0002-4075-7564, Barker, J.P. orcid.org/0000-0002-1684-5660 et al. (8 more authors) (2026) The Cadenza lyric intelligibility prediction (CLIP) dataset. Data in Brief, 65. 112466. ISSN: 2352-3409
Abstract
This paper presents CLIP, a dataset of 11,072 popular Western music signals sourced from independent artists, accompanied by ground-truth lyrics and lyric intelligibility scores from listening tests. The dataset is designed to facilitate music information retrieval (MIR) research using machine learning. It was created to enable the development of algorithms that predict lyric intelligibility for the Cadenza ICASSP 2026 Signal Processing Grand Challenge, and it is currently the only publicly available large-scale dataset for this task. The music was sourced from the Free Music Archive (FMA) dataset and is unlikely to be familiar to listeners. We excluded tracks whose license did not allow derivative works and those without English singing. Ground-truth transcriptions were generated by seven native English speakers, resulting in 3,700 excerpts of 5 to 10 words each, drawn from 1,452 different songs. A hearing loss simulation was also applied to the stereo audio, producing 11,100 music signals with no, mild or moderate simulated hearing loss, so that a more diverse range of hearing is represented in the dataset. Human transcriptions were then collected via an online listening experiment. Participants self-reported having normal hearing and being native English speakers; they listened to each music signal twice before transcribing each line. Final intelligibility scores were computed as the ratio of matching words between the listening-test responses and the ground-truth transcriptions. The final dataset consists of audio, ground-truth lyrics, intelligibility scores and associated metadata.
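The scoring rule described in the abstract is a word-match ratio between a listener's response and the ground-truth lyric. The paper's exact tokenisation and alignment procedure is not reproduced here; the sketch below is a minimal illustration, assuming case- and punctuation-insensitive bag-of-words matching, with `response` and `ground_truth` as hypothetical inputs.

```python
import re
from collections import Counter

def intelligibility_score(response: str, ground_truth: str) -> float:
    """Fraction of ground-truth words matched by the listener's response.

    A minimal sketch: the published scoring may use stricter tokenisation
    or order-sensitive alignment; this version treats lyrics as bags of words.
    """
    def words(text: str) -> list[str]:
        # Lower-case and keep only alphabetic tokens (apostrophes allowed).
        return re.findall(r"[a-z']+", text.lower())

    truth = words(ground_truth)
    if not truth:
        return 0.0
    available = Counter(words(response))
    matched = 0
    for w in truth:
        if available[w] > 0:
            matched += 1
            available[w] -= 1
    return matched / len(truth)

# Example: 4 of the 5 ground-truth words are matched, giving a score of 0.8.
print(intelligibility_score("i can see the light", "i can see the lights"))
```

Counting each ground-truth word at most once (via the `Counter`) prevents a repeated word in the response from being credited multiple times.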
Metadata
| Item Type: | Article |
|---|---|
| Authors/Creators: | Roa-Dabike, G.; Cox, T.J.; Barker, J.P.; et al. (8 more authors) |
| Copyright, Publisher and Additional Information: | © 2026 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
| Keywords: | Music; Singing; English; MIR; Deep learning; Machine learning; Hearing; Hearing loss |
| Dates: | Published: 2026 |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Funding Information: | Engineering and Physical Sciences Research Council, grant EP/W019434/1 |
| Date Deposited: | 04 Feb 2026 12:36 |
| Last Modified: | 04 Feb 2026 12:36 |
| Status: | Published |
| Publisher: | Elsevier BV |
| Refereed: | Yes |
| Identification Number: | 10.1016/j.dib.2026.112466 |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:237441 |
