Hussain, T., Wang, W., Bouaynaya, N. et al. (2 more authors) (2022) Deep learning for audio visual emotion recognition. In: Proceedings of the 2022 25th International Conference on Information Fusion (FUSION). 2022 25th International Conference on Information Fusion (FUSION), 04-07 Jul 2022, Linköping, Sweden. Institute of Electrical and Electronics Engineers. ISBN 9781665489416
Abstract
Human emotions can be presented in data with multiple modalities, e.g. video, audio and text. An automated system for emotion recognition needs to address a number of challenging issues, including feature extraction and dealing with variations and noise in data. Deep learning has been used extensively in recent years, offering excellent performance in emotion recognition. This work presents a new method based on audio and visual modalities, where visual cues facilitate the detection of speech or non-speech frames and of the emotional state of the speaker. Different from previous works, we propose the use of novel speech features, e.g. the Wavegram, which is extracted with a one-dimensional Convolutional Neural Network (CNN) learned directly from time-domain waveforms, and Wavegram-Logmel features, which combine the Wavegram with the log mel spectrogram. The system is then trained in an end-to-end fashion on the SAVEE database, also taking advantage of the correlations among the streams. It is shown that the proposed approach outperforms traditional and state-of-the-art deep-learning-based approaches built separately on auditory and visual handcrafted features for the prediction of spontaneous and natural emotions.
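The Wavegram-Logmel idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' network: a single bank of random, untrained filters stands in for the learned one-dimensional CNN, and the function names, the 16 kHz sampling rate, the 512-sample frame, the 160-sample hop and the 64 mel bands are all assumptions chosen for illustration.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D waveform into overlapping frames of shape (T, frame_len)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters of shape (n_mels, n_fft // 2 + 1)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(n_mels):
        lo, ctr, hi = hz_pts[m], hz_pts[m + 1], hz_pts[m + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel(x, sr, frame_len, hop, n_mels):
    """Log mel spectrogram of shape (T, n_mels)."""
    frames = frame_signal(x, frame_len, hop) * np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(n_mels, frame_len, sr).T
    return np.log(mel + 1e-10)

def wavegram_like(x, filters, frame_len, hop):
    """Strided 1-D convolution on the raw waveform followed by ReLU.

    `filters` has shape (n_filters, frame_len). In a trained system these
    weights would be learned; here they are random placeholders.
    """
    frames = frame_signal(x, frame_len, hop)
    return np.maximum(frames @ filters.T, 0.0)  # (T, n_filters)

def wavegram_logmel(x, sr, filters, frame_len=512, hop=160, n_mels=64):
    """Stack the two time-frequency maps as channels: (2, T, n_mels)."""
    wg = wavegram_like(x, filters, frame_len, hop)
    lm = log_mel(x, sr, frame_len, hop, n_mels)
    return np.stack([wg, lm], axis=0)

# Demo on one second of a synthetic 440 Hz tone (assumed 16 kHz sampling rate).
sr = 16000
t = np.arange(sr) / sr
waveform = np.sin(2.0 * np.pi * 440.0 * t)
rng = np.random.default_rng(0)
filters = rng.standard_normal((64, 512)) * 0.05  # untrained stand-in for the 1-D CNN
features = wavegram_logmel(waveform, sr, filters)
```

Both maps share the same frame length and hop, so they align frame by frame and can be stacked as two input channels for a downstream two-dimensional CNN. The number of filters is set equal to the number of mel bands only so that the two maps have matching shapes in this sketch.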
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | |
Copyright, Publisher and Additional Information: | © 2022 The Authors. This accepted manuscript version is available under a Creative Commons Attribution CC BY licence. (http://creativecommons.org/licenses/by/4.0) |
Keywords: | Deep learning; convolutional neural networks; emotion recognition; audio and visual data |
Dates: | |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Automatic Control and Systems Engineering (Sheffield) |
Funding Information: | Funder: Engineering and Physical Sciences Research Council; Grant number: EP/T013265/1 |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 01 Jun 2022 13:26 |
Last Modified: | 12 Jan 2024 11:09 |
Status: | Published |
Publisher: | Institute of Electrical and Electronics Engineers |
Refereed: | Yes |
Identification Number: | 10.23919/FUSION49751.2022.9841342 |
Related URLs: | |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:187472 |