Ng, R., Chettri, B. and Hain, T. orcid.org/0000-0003-0939-3464 (2016) Combining weak tokenisers for phonotactic language recognition in a resource-constrained setting. In: Interspeech 2016. Interspeech, 09-12 Sep 2016, San Francisco, CA. ISCA, pp. 2939-2943.
Abstract
In the phonotactic approach to language recognition, a phone tokeniser is normally used to transform the audio signal into acoustic tokens. The language identity of the speech is modelled by the occurrence statistics of the decoded tokens. The performance of this approach depends heavily on the quality of the audio tokeniser. A high-quality tokeniser in matched conditions is not always available for a language recognition task. This study investigated the performance of a phonotactic language recogniser in a resource-constrained setting, following the NIST LRE 2015 specification. An ensemble of phone tokenisers was constructed by applying unsupervised sequence training on different target languages, followed by score-based fusion. This method gave a 5–7% relative performance improvement over the baseline system on the LRE 2015 evaluation set. This gain was retained when the ensemble phonotactic system was further fused with an acoustic iVector system.
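The two ideas the abstract relies on, token occurrence statistics and score-based fusion, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the n-gram order, the language model, or the fusion rule, so the bigram relative-frequency model, the probability floor, and the equal-weight linear fusion below are assumptions, and all function names (`ngram_counts`, `train_model`, `log_likelihood`, `fuse_scores`) are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n=2):
    """Count the n-grams occurring in one decoded token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def train_model(sequences, n=2):
    """Estimate n-gram relative frequencies from one language's token sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(ngram_counts(seq, n))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()}


def log_likelihood(tokens, model, n=2, floor=1e-6):
    """Score a token sequence under a model; unseen n-grams get a floor
    probability (an assumed, very crude form of smoothing)."""
    return sum(c * math.log(model.get(gram, floor))
               for gram, c in ngram_counts(tokens, n).items())


def fuse_scores(per_tokeniser_scores, weights=None):
    """Linearly combine per-language scores from several tokenisers.

    per_tokeniser_scores: list of {language: score} dicts, one per tokeniser.
    weights: optional list of fusion weights; equal weights are assumed if omitted.
    """
    k = len(per_tokeniser_scores)
    weights = weights or [1.0 / k] * k
    languages = per_tokeniser_scores[0].keys()
    return {lang: sum(w * scores[lang]
                      for w, scores in zip(weights, per_tokeniser_scores))
            for lang in languages}


# Toy usage: two tokenisers each score the same utterance against two languages.
fused = fuse_scores([{"en": -1.0, "fr": -3.0}, {"en": -2.0, "fr": -2.0}])
best_language = max(fused, key=fused.get)
```

In practice, fusion weights are usually learned on held-out data rather than fixed to be equal; the equal weighting here only keeps the sketch short.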
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Ng, R.; Chettri, B.; Hain, T. |
Copyright, Publisher and Additional Information: | © 2016 ISCA |
Keywords: | Language recognition; phonotactics; multilingual adaptation |
Dates: | |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Funding Information: | Funder: ENGINEERING AND PHYSICAL SCIENCE RESEARCH COUNCIL (EPSRC); Grant number: UNSPECIFIED |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 03 Jan 2017 10:27 |
Last Modified: | 28 Jul 2017 12:59 |
Published Version: | http://dx.doi.org/10.21437/Interspeech.2016-630 |
Status: | Published |
Publisher: | ISCA |
Refereed: | Yes |
Identification Number: | 10.21437/Interspeech.2016-630 |
Related URLs: | |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:109213 |