Schröder, J., Moritz, N., Anemüller, J. et al. (2 more authors) (2017) Classifier architectures for acoustic scenes and events : implications for DNNs, TDNNs, and perceptual features from DCASE 2016. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25 (6). pp. 1304-1314. ISSN 2329-9290
Abstract
This paper evaluates neural network (NN) based systems and compares them to Gaussian mixture model (GMM) and hidden Markov model (HMM) approaches for acoustic scene classification (SC) and polyphonic acoustic event detection (AED), applied to data from the “Detection and Classification of Acoustic Scenes and Events 2016” (DCASE'16) challenge, task 1 and task 3, respectively. For both tasks, deep neural networks (DNNs) and features based on an amplitude modulation filterbank and a Gabor filterbank (GFB) are evaluated and compared to standard approaches. For SC, a time-delay NN approach is additionally proposed that enables analysis of long contextual information, similar to recurrent NNs, but with training effort comparable to that of conventional DNNs. The SC system proposed for task 1 of the DCASE'16 challenge attains a recognition accuracy of 77.5%, which is 5.6% higher than that of the DCASE'16 baseline system. For the AED task, DNNs are adopted in tandem and hybrid approaches, i.e., as part of HMM-based systems. These systems are evaluated on the polyphonic data of task 3 from the DCASE'16 challenge. Several strategies to address the issue of polyphony are considered. It is shown that DNN-based systems perform less accurately than the traditional systems for this task. The best results are achieved using GFB features in combination with a multiclass GMM-HMM back end.
Metadata
Item Type: | Article |
---|---|
Authors/Creators: | Schröder, J.; Moritz, N.; Anemüller, J.; et al. (2 more authors) |
Copyright, Publisher and Additional Information: | © 2017 IEEE. |
Keywords: | Acoustic event detection; amplitude modulation filterbank; DCASE’16; deep neural network; Gabor filterbank; scene classification; time-delay neural network |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 06 Apr 2020 10:45 |
Last Modified: | 06 Apr 2020 10:45 |
Status: | Published |
Publisher: | Institute of Electrical and Electronics Engineers (IEEE) |
Refereed: | Yes |
Identification Number: | 10.1109/taslp.2017.2690569 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:159153 |