Robust binaural localization of a target sound source by combining spectral source models and deep neural networks

Abstract

Despite clear evidence for top–down (e.g., attentional) effects in biological spatial hearing, relatively few machine hearing systems exploit top–down model-based knowledge in sound localization. This paper addresses this issue by proposing a novel framework for binaural sound localization that combines model-based information about the spectral characteristics of sound sources with deep neural networks (DNNs). A target source model and a background source model are first estimated during a training phase using spectral features extracted from sound signals in isolation. When the identity of the background source is not available, a universal background model can be used. During testing, the source models are used jointly to explain the mixed observations and improve the localization process by selectively weighting source azimuth posteriors output by a DNN-based localization system. To address the possible mismatch between training and testing, a model adaptation process is further employed on-the-fly during testing, which adapts the background model parameters directly from the noisy observations in an iterative manner. The proposed system, therefore, combines model-based and data-driven information flow within a single computational framework. The evaluation task involved localization of a target speech source in the presence of an interfering source and room reverberation. Our experiments show that by exploiting the model-based information in this way, sound localization performance can be improved substantially under various noisy and reverberant conditions.
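The selective weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `weighted_azimuth_estimate`, the array shapes, and the choice of a weighted sum of log-posteriors are all assumptions made for illustration. The sketch assumes the DNN yields one azimuth posterior per time frame and frequency band, and that the source models yield, for each such unit, a weight between 0 and 1 expressing how strongly the target dominates it.

```python
import numpy as np


def weighted_azimuth_estimate(posteriors, target_weights, azimuths, eps=1e-12):
    """Combine per-unit azimuth posteriors using target-dominance weights.

    posteriors:     array (frames, bands, n_azimuths), hypothetical DNN output
    target_weights: array (frames, bands), hypothetical weights in [0, 1]
                    derived from the target and background source models
    azimuths:       array (n_azimuths,), candidate azimuths in degrees
    Returns the estimated azimuth and the combined posterior over azimuths.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    target_weights = np.asarray(target_weights, dtype=float)
    if posteriors.shape[:2] != target_weights.shape:
        raise ValueError("weights must have shape (frames, bands)")

    # Assumed combination rule: weighted sum of log-posteriors, so that
    # units dominated by the background contribute little evidence.
    log_post = np.log(posteriors + eps)
    scores = np.einsum("tf,tfa->a", target_weights, log_post)

    # Normalise in a numerically stable way to obtain a distribution.
    scores = scores - scores.max()
    combined = np.exp(scores)
    combined /= combined.sum()
    return azimuths[int(np.argmax(combined))], combined


# Toy data: frame 0 points at +30 degrees (target), frame 1 at -30 (interferer).
azimuths = np.array([-30.0, 0.0, 30.0])
posteriors = np.array([[[0.1, 0.1, 0.8]], [[0.9, 0.05, 0.05]]])
weighted, _ = weighted_azimuth_estimate(
    posteriors, np.array([[0.9], [0.1]]), azimuths)
unweighted, _ = weighted_azimuth_estimate(
    posteriors, np.ones((2, 1)), azimuths)
```

In this toy case the weighted estimate is 30 degrees, while equal weighting is pulled to -30 degrees by the interferer-dominated frame. How the weights are actually derived from the source models, and how the background model is adapted during testing, is described in the paper itself and is not reproduced here.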

Metadata

Item Type:	Article
Authors/Creators:	Ma, N.; Gonzalez, J.; Brown, G.J. (https://orcid.org/0000-0001-8565-5476)
Copyright, Publisher and Additional Information:	© 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Reproduced in accordance with the publisher's self-archiving policy.
Keywords:	binaural source localisation; machine hearing; reverberation; sound source combination; masking
Dates:	Accepted: 10 July 2018; Published (online): 13 July 2018; Published: 13 July 2018
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield)
Funding Information:	Funder: EUROPEAN COMMISSION - FP6/FP7; Grant number: TWO!EARS - 618075
Depositing User:	Symplectic Sheffield
Date Deposited:	23 Jul 2018 10:22
Last Modified:	20 Aug 2018 09:49
Published Version:	https://doi.org/10.1109/TASLP.2018.2855960
Status:	Published
Publisher:	Institute of Electrical and Electronics Engineers
Refereed:	Yes
Identification Number:	10.1109/TASLP.2018.2855960
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:133240
