Zeiler, S., Nicheli, R., Ma, N. orcid.org/0000-0002-4112-3109 et al. (2 more authors) (2016) Robust audiovisual speech recognition using noise-adaptive linear discriminant analysis. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 20-25 Mar 2016, Shanghai. IEEE, pp. 2797-2801. ISBN 9781479999880
Abstract
© 2016 IEEE. Automatic speech recognition (ASR) has become a widespread and convenient mode of human-machine interaction, but it is still not sufficiently reliable when used under highly noisy or reverberant conditions. One option for achieving far greater robustness is to include another modality that is unaffected by acoustic noise, such as video information. Currently the most successful approaches for such audiovisual ASR systems, coupled hidden Markov models (HMMs) and turbo decoding, both allow for slight asynchrony between audio and video features, and significantly improve recognition rates in this way. However, both typically still neglect residual errors in the estimation of audio features, so-called observation uncertainties. This paper compares two strategies for adding these observation uncertainties into the decoder, and shows that significant recognition rate improvements are achievable for both coupled HMMs and turbo decoding.
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: |
|
Copyright, Publisher and Additional Information: | © 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. |
Keywords: | Audiovisual speech recognition; uncertainty-of-observation techniques; discriminative transformation |
Dates: |
|
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Funding Information: | Funder: EUROPEAN COMMISSION - FP6/FP7; Grant number: TWO!EARS - 618075 |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 18 Aug 2016 10:02 |
Last Modified: | 19 Dec 2022 13:34 |
Published Version: | http://dx.doi.org/10.1109/ICASSP.2016.7472187 |
Status: | Published |
Publisher: | IEEE |
Refereed: | Yes |
Identification Number: | 10.1109/ICASSP.2016.7472187 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:102623 |