Baghai-Ravary, L., Beet, S.W. and Tokhi, M.O. (1995) Multi-Dimensional Coding of Speech Data. Research Report. ACSE Research Report 596. Department of Automatic Control and Systems Engineering
Abstract
This paper presents specific new techniques for coding speech representations and a new general approach to coding for compression, which directly utilises the multi-dimensional nature of the input data. Many methods of speech analysis yield a two-dimensional pattern, with time as one of the dimensions. Various such speech representations, and power spectrum sequences in particular, are shown here to be amenable to two-dimensional compression using specific models which take account of a large part of their structure in both dimensions. Two newly developed techniques, Multi-step Adaptive Flux Interpolation (MAFI) and Multi-step Flow Based Prediction (MFBP), are presented. These are able to code power spectral density (PSD) sequences of speech more completely and accurately than conventional methods, and at a low computational cost. This is due to their ability to model non-stationary, piecewise-continuous signals, of which speech is a good example. MAFI and MFBP are first applied in the time domain and then to the encoded data in the second dimension. This approach allows the coding algorithm to exploit redundancy in both dimensions, giving a significant improvement in the overall compression ratio. Furthermore, the compression may be reapplied several times, with the data being further compressed by each application.
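The two-pass idea in the abstract (code along time first, then code the already-encoded data along the second dimension) can be illustrated with a minimal sketch. This record does not define MAFI or MFBP, so the sketch swaps in plain first-order difference prediction as a stand-in; it is not the authors' method. The function names `predict_residual`, `encode_2d` and `decode_2d`, and the synthetic integer "PSD" pattern, are assumptions made for illustration only.

```python
import numpy as np

def predict_residual(x, axis):
    """Stand-in coder (NOT MAFI/MFBP): first-order difference along one axis.

    The first slice along `axis` is kept unchanged; every later slice is
    replaced by its difference from the previous slice.
    """
    first = np.zeros_like(np.take(x, [0], axis=axis))
    return np.diff(x, axis=axis, prepend=first)

def encode_2d(psd):
    """Pass 1 along time (axis 0), then pass 2 along frequency (axis 1)
    applied to the output of pass 1."""
    r_time = predict_residual(psd, axis=0)
    return predict_residual(r_time, axis=1)

def decode_2d(code):
    """Undo the two passes in reverse order using cumulative sums."""
    r_time = np.cumsum(code, axis=1)
    return np.cumsum(r_time, axis=0)

# Hypothetical, smoothly varying time-frequency pattern (6 frames x 8 bins),
# quantised to integers so that the round trip is exact.
t = np.arange(6)[:, None]
f = np.arange(8)[None, :]
psd = 100 + 10 * t + 3 * f

time_only = predict_residual(psd, axis=0)
code = encode_2d(psd)

print("non-zero values after time pass only:", np.count_nonzero(time_only))
print("non-zero values after both passes:   ", np.count_nonzero(code))
print("lossless round trip:", np.array_equal(decode_2d(code), psd))
```

For this toy pattern the second pass removes redundancy that the time-only pass leaves behind, which is the effect the abstract attributes to exploiting both dimensions. Real PSD sequences are far less regular, so the gain shown here says nothing about the compression ratios reported in the paper.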
Metadata
Item Type: | Monograph |
---|---|
Authors/Creators: | Baghai-Ravary, L.; Beet, S.W.; Tokhi, M.O. |
Copyright, Publisher and Additional Information: | The Department of Automatic Control and Systems Engineering research reports offer a forum for the research output of the academic staff and research students of the Department at the University of Sheffield. Papers are reviewed for quality and presentation by a departmental editor. However, the contents and opinions expressed remain the responsibility of the authors. Some papers in the series may have been subsequently published elsewhere and you are advised to cite the later published version in these instances. |
Keywords: | Adaptive flux interpolation; Flow based prediction; Speech coding. |
Dates: | 1995 |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Automatic Control and Systems Engineering (Sheffield) > ACSE Research Reports |
Depositing User: | MRS ALISON THERESA BARNETT |
Date Deposited: | 08 Aug 2014 08:35 |
Last Modified: | 01 Nov 2016 01:02 |
Status: | Published |
Publisher: | Department of Automatic Control and Systems Engineering |
Series Name: | ACSE Research Report 596 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:80063 |