Saz, O., Doulaty, M. and Hain, T. (2014) Background-tracking acoustic features for genre identification of broadcast shows. In: Spoken Language Technology Workshop (SLT), 2014 IEEE. Spoken Language Technology Workshop (SLT), 07-10 Dec 2014, South Lake Tahoe, NV. IEEE, 118-123. ISBN 9781479971299
Abstract
This paper presents a novel method for extracting acoustic features that characterise the background environment in audio recordings. These features are based on the output of an alignment that fits multiple parallel background-based Constrained Maximum Likelihood Linear Regression transformations asynchronously to the input audio signal. With this setup, the resulting features can track changes in the audio background, such as the appearance and disappearance of music, applause or laughter, independently of the speakers in the foreground of the audio. The ability to provide this type of acoustic description of audiovisual data has many potential applications, including automatic classification of broadcast archives and improved automatic transcription and subtitling. In this paper, the performance of these features is explored in a genre identification task on a set of 332 BBC shows. Using Gaussian Mixture Model classifiers, the proposed background-tracking features outperform short-term Perceptual Linear Prediction features in this task (72% vs 62% accuracy). The use of more complex classifiers, Hidden Markov Models and Support Vector Machines, increases the accuracy of the system with the novel background-tracking features to 79% and 81% respectively.
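As a rough illustration of the Gaussian Mixture Model classification step mentioned in the abstract, the sketch below scores a show's feature frames against one GMM per genre and picks the genre with the highest average frame log-likelihood. It is not the authors' implementation: the feature extraction (the CMLLR-based background tracking) is not shown, the models are assumed to be already trained, and the genre names, model parameters, and function names are all hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_frame_loglik(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify_show(frames, genre_gmms):
    """Return (best_genre, scores) using the average frame log-likelihood per genre."""
    scores = {
        genre: sum(gmm_frame_loglik(x, gmm) for x in frames) / len(frames)
        for genre, gmm in genre_gmms.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical 2-dimensional toy models; a real system would train these on data.
toy_gmms = {
    "news": [(0.6, [0.0, 0.0], [1.0, 1.0]), (0.4, [1.0, 0.5], [1.0, 1.0])],
    "comedy": [(0.5, [4.0, 4.0], [1.0, 1.0]), (0.5, [5.0, 3.0], [1.0, 1.0])],
}
toy_frames = [[4.2, 3.9], [4.8, 3.1], [3.7, 4.4]]  # made-up feature frames
predicted, genre_scores = classify_show(toy_frames, toy_gmms)
```

Averaging over frames keeps scores comparable between shows of different lengths; the decision itself depends only on which genre model fits the frames best.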
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Saz, O., Doulaty, M. and Hain, T. |
Copyright, Publisher and Additional Information: | © 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 29 Jan 2016 15:49 |
Last Modified: | 19 Dec 2022 13:32 |
Published Version: | http://dx.doi.org/10.1109/SLT.2014.7078560 |
Status: | Published |
Publisher: | IEEE |
Refereed: | Yes |
Identification Number: | 10.1109/SLT.2014.7078560 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:92449 |