Ma, N. orcid.org/0000-0002-4112-3109, Brown, G. orcid.org/0000-0001-8565-5476 and May, T. (Accepted: 2015) Exploiting deep neural networks and head movements for binaural localisation of multiple speakers in reverberant conditions. In: Interspeech. INTERSPEECH 2015, 06-10 Sep 2015, Dresden, Germany. International Speech Communication Association, pp. 160-164.
Abstract
This paper presents a novel machine-hearing system that exploits deep neural networks (DNNs) and head movements for binaural localisation of multiple speakers in reverberant conditions. DNNs are used to map binaural features, consisting of the complete cross-correlation function (CCF) and interaural level differences (ILDs), to the source azimuth. Our approach was evaluated using a localisation task in which sources were located in a full 360-degree azimuth range. As a result, front-back confusions often occurred due to the similarity of binaural features in the front and rear hemifields. To address this, a head movement strategy was incorporated in the DNN-based model to help reduce the front-back errors. Our experiments show that, compared to a system based on a Gaussian mixture model (GMM) classifier, the proposed DNN system substantially reduces localisation errors under challenging acoustic scenarios in which multiple speakers and room reverberation are present.
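The two binaural features named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify framing, frequency analysis, lag range, or normalisation, so the function name `binaural_features`, the `max_lag_ms` parameter, the broadband single-frame processing, and the energy normalisation are all assumptions made for illustration.

```python
import numpy as np


def binaural_features(left, right, fs, max_lag_ms=1.0):
    """Return (ccf, ild_db) for one frame of a left/right ear signal pair.

    Hypothetical helper: the lag range and normalisation are assumptions,
    not values taken from the paper.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    max_lag = int(round(max_lag_ms * 1e-3 * fs))

    # Full cross-correlation; index len(right) - 1 corresponds to zero lag.
    full = np.correlate(left, right, mode="full")
    zero = len(right) - 1
    ccf = full[zero - max_lag: zero + max_lag + 1]

    # Normalise by the geometric mean of the two channel energies.
    e_left = np.sum(left ** 2)
    e_right = np.sum(right ** 2)
    ccf = ccf / (np.sqrt(e_left * e_right) + 1e-12)

    # Interaural level difference in dB (positive when the left ear is louder).
    ild_db = 10.0 * np.log10((e_left + 1e-12) / (e_right + 1e-12))
    return ccf, ild_db


if __name__ == "__main__":
    # Synthetic example: the left channel is a delayed, attenuated copy
    # of the right channel.
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(512)
    delay = 5
    right = noise
    left = 0.5 * np.concatenate([np.zeros(delay), noise[:-delay]])

    ccf, ild_db = binaural_features(left, right, fs=16000)
    lags = np.arange(len(ccf)) - (len(ccf) - 1) // 2
    print("peak lag (samples):", lags[np.argmax(ccf)])
    print("ILD (dB):", round(float(ild_db), 2))
```

In a system of the kind the abstract describes, the whole `ccf` vector, rather than only its peak position, would be concatenated with the ILD and passed to a classifier that outputs source azimuth. Features of this kind are similar for sources in the front and rear hemifields, which is the ambiguity the head movement strategy is meant to reduce.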
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Ma, N.; Brown, G.; May, T. |
Copyright, Publisher and Additional Information: | (c) 2015 International Speech Communication Association. Reproduced in accordance with the publisher's self-archiving policy. |
Keywords: | Binaural source localisation; deep neural networks; head movements; machine hearing; reverberation |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Funding Information: | Funder: EUROPEAN COMMISSION - FP6/FP7; Grant number: TWO!EARS - 618075 |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 19 Aug 2016 08:52 |
Last Modified: | 19 Dec 2022 13:34 |
Published Version: | http://www.isca-speech.org/archive/interspeech_201... |
Status: | Published |
Publisher: | International Speech Communication Association |
Refereed: | Yes |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:102628 |