Ma, N. orcid.org/0000-0002-4112-3109, Brown, G. orcid.org/0000-0001-8565-5476 and May, T. (Accepted: 2015) Exploiting deep neural networks and head movements for binaural localisation of multiple speakers in reverberant conditions. In: Interspeech. INTERSPEECH 2015, 06-10 Sep 2015, Dresden, Germany. International Speech Communication Association, pp. 160-164.
Abstract
This paper presents a novel machine-hearing system that exploits deep neural networks (DNNs) and head movements for binaural localisation of multiple speakers in reverberant conditions. DNNs are used to map binaural features, consisting of the complete cross-correlation function (CCF) and interaural level differences (ILDs), to the source azimuth. Our approach was evaluated using a localisation task in which sources were located in a full 360-degree azimuth range. As a result, front-back confusions often occurred due to the similarity of binaural features in the front and rear hemifields. To address this, a head movement strategy was incorporated in the DNN-based model to help reduce the front-back errors. Our experiments show that, compared to a system based on a Gaussian mixture model (GMM) classifier, the proposed DNN system substantially reduces localisation errors under challenging acoustic scenarios in which multiple speakers and room reverberation are present.
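The two binaural features named in the abstract can be illustrated with a minimal sketch. It is not the authors' processing chain: it works on a single broadband frame, and the function name `binaural_features`, the 1 ms lag limit and the energy-ratio definition of the ILD are assumptions made for illustration only.

```python
import numpy as np

def binaural_features(left, right, fs, max_lag_ms=1.0):
    """Return (ccf, ild_db) for one frame of a two-channel signal.

    Hypothetical helper: the lag range and normalisation are
    illustrative assumptions, not values taken from the paper.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)

    # Largest lag (in samples) kept on either side of zero lag.
    max_lag = int(round(max_lag_ms * 1e-3 * fs))

    # Full cross-correlation; zero lag sits at index len(right) - 1.
    full = np.correlate(left, right, mode="full")
    zero = len(right) - 1
    ccf = full[zero - max_lag: zero + max_lag + 1]

    # Normalise by the channel energies so values lie in [-1, 1].
    e_left = np.sum(left ** 2)
    e_right = np.sum(right ** 2)
    norm = np.sqrt(e_left * e_right)
    if norm > 0:
        ccf = ccf / norm

    # ILD as the left/right energy ratio in decibels.
    eps = 1e-12  # guards against log of zero for silent frames
    ild_db = 10.0 * np.log10((e_left + eps) / (e_right + eps))
    return ccf, ild_db
```

With this sign convention, a source whose signal reaches the right ear later and weaker than the left yields a CCF peak at a negative lag and a positive ILD. A classifier such as the DNN described above would take the whole CCF vector plus the ILD as input, rather than only the peak position.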
Metadata
Field | Value
---|---
Copyright, Publisher and Additional Information: | (c) 2015 International Speech Communication Association. Reproduced in accordance with the publisher's self-archiving policy.
Keywords: | Binaural source localisation; deep neural networks; head movements; machine hearing; reverberation
Institution: | The University of Sheffield
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield)
Depositing User: | Symplectic Sheffield
Date Deposited: | 19 Aug 2016 08:52
Last Modified: | 19 Dec 2022 13:34
Published Version: | http://www.isca-speech.org/archive/interspeech_201...
Status: | Published
Publisher: | International Speech Communication Association
Refereed: | Yes