Emotion Recognition from the Speech Signal by Effective Combination of Generative and Discriminative Models

Abstract

In this paper, we propose an effective way of combining discriminative and generative models for emotion recognition from the speech signal. Finding an efficient feature extraction algorithm that captures only the attributes pertinent to the task and filters out the other aspects of the data is very challenging, if not impossible. We propose an interface between the front-end and the back-end to compensate for the parameterization block's shortcomings in suppressing the irrelevant dimensions of the signal. This interface is a generative model, which performs substantial dimensionality reduction, allows a long-term feature to be extracted, and paves the way for better classification of the data by a discriminative model. The method yields a 7.6% absolute performance improvement over the baseline system and achieves 87.6% accuracy on the emotion recognition task. Human performance on the same database is reportedly 84.3%.
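
The abstract does not say which generative or discriminative models are used, so the following is only a minimal sketch of the general idea, not the authors' system. It assumes one diagonal Gaussian per emotion class as a stand-in generative interface and a softmax regression as a stand-in discriminative back-end, and it runs on synthetic frames rather than speech features. Every name in it (fit_diag_gaussian, utterance_embedding, run_demo and so on) is hypothetical. Each variable-length utterance is mapped to one averaged log-likelihood per class model, which gives a short, fixed-length, utterance-level feature for the classifier.

```python
import math
import random


def fit_diag_gaussian(frames):
    """Fit a diagonal-covariance Gaussian to a list of frame vectors."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return mean, var


def frame_log_likelihood(frame, model):
    """Log-density of one frame under a diagonal Gaussian."""
    mean, var = model
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
               for x, m, v in zip(frame, mean, var))


def utterance_embedding(frames, models):
    """Map a variable-length utterance to one score per class model.

    Scores are frame-averaged log-likelihoods; centring them across the
    models is an assumption of this sketch, made to keep training stable.
    """
    scores = [sum(frame_log_likelihood(f, m) for f in frames) / len(frames)
              for m in models]
    centre = sum(scores) / len(scores)
    return [s - centre for s in scores]


def train_softmax(X, y, n_classes, lr=0.1, epochs=50):
    """Train a softmax regression (with bias) by plain SGD."""
    dim = len(X[0])
    W = [[0.0] * (dim + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for x, label in zip(X, y):
            xb = x + [1.0]  # append a constant bias input
            scores = [sum(w * v for w, v in zip(row, xb)) for row in W]
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            for k in range(n_classes):
                grad = exps[k] / total - (1.0 if k == label else 0.0)
                for j in range(dim + 1):
                    W[k][j] -= lr * grad * xb[j]
    return W


def predict(W, x):
    """Return the index of the highest-scoring class."""
    xb = x + [1.0]
    scores = [sum(w * v for w, v in zip(row, xb)) for row in W]
    return scores.index(max(scores))


def make_utterance(rng, shift, dim=3):
    """Synthetic stand-in for an utterance: 20 to 40 Gaussian frames."""
    n_frames = rng.randint(20, 40)
    return [[rng.gauss(shift, 1.0) for _ in range(dim)]
            for _ in range(n_frames)]


def run_demo(seed=0):
    """Train on synthetic data and return held-out accuracy."""
    rng = random.Random(seed)
    shifts = [0.0, 2.0]  # one synthetic "emotion" per frame-mean shift
    train = [(make_utterance(rng, s), k)
             for k, s in enumerate(shifts) for _ in range(20)]
    # Generative stage: one model per class, fitted on pooled frames.
    models = [fit_diag_gaussian([f for u, k in train if k == c for f in u])
              for c in range(len(shifts))]
    # Discriminative stage: classify the utterance-level embeddings.
    X = [utterance_embedding(u, models) for u, _ in train]
    y = [k for _, k in train]
    W = train_softmax(X, y, len(shifts))
    test = [(make_utterance(rng, s), k)
            for k, s in enumerate(shifts) for _ in range(10)]
    hits = sum(predict(W, utterance_embedding(u, models)) == k
               for u, k in test)
    return hits / len(test)
```

Calling run_demo() returns the fraction of held-out synthetic utterances classified correctly. The embedding has as many dimensions as there are class models, regardless of utterance length or frame dimensionality, which is the sense in which the generative stage reduces dimensionality in this sketch.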

Metadata

Item Type:	Conference or Workshop Item
Authors/Creators:	Loweimi, E.; Doulaty, M.; Barker, J.; Hain, T.
Keywords:	Discriminative model; Emotion recognition; Front-end; Generative model; Speech signal
Dates:	Published: 24 June 2015
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > USES (University of Sheffield Engineering Symposium)
Depositing User:	Repository Officer
Date Deposited:	22 Aug 2016 12:26
Last Modified:	25 Oct 2016 21:14
Status:	Published
Identification Number:	10.15445/02012015.36
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:103952
