Turner, Daniel and Murphy, Damian T. orcid.org/0000-0002-6676-9459 (2024) A deep learning approach to the prediction of time-frequency spatial parameters for use in stereo upmixing. In: Proceedings of the 27th International Conference on Digital Audio Effects (DAFx24), Guildford, Surrey, UK, September 3-7, 2024. 27th International Conference on Digital Audio Effects, DAFx 2024, 03-07 Sep 2024. DAFx, GBR, pp. 428-435.
Abstract
This paper presents a deep learning approach to predicting time-frequency spatial parameters for use within stereo upmixing algorithms. The approach uses a Multi-Channel U-Net with Residual connections (MuCh-Res-U-Net), trained on a novel dataset of stereo and parametric time-frequency spatial audio data, to predict time-frequency spatial parameters from a stereo input signal for positions on a sphere sampled with a 50-point Lebedev quadrature. An example upmix pipeline is then proposed that uses the predicted time-frequency spatial parameters both to extract stereo signal components and to remap them to target spherical harmonic components, enabling the generation of a full spherical representation of the upmixed sound field.
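For readers unfamiliar with the sampling scheme named in the abstract, the following is a minimal sketch of a 50-point Lebedev grid on the unit sphere, built from the standard octahedrally symmetric point families and their tabulated weights. It is an illustration only and is not taken from the paper: the function name `lebedev_50`, the weight normalisation (weights summing to 1), and the azimuth/elevation convention are assumptions made here.

```python
import itertools
import math

import numpy as np


def lebedev_50():
    """Return (points, weights) for a 50-point Lebedev rule on the unit sphere.

    Hypothetical helper for illustration. Weights are normalised to sum to 1,
    so a weighted sum approximates the spherical mean of a function.
    """
    points, weights = [], []

    def add_family(base, weight):
        # Add every distinct permutation and sign flip of `base`.
        seen = set()
        for perm in set(itertools.permutations(base)):
            for signs in itertools.product((1.0, -1.0), repeat=3):
                p = tuple(s * c if c != 0.0 else 0.0 for s, c in zip(signs, perm))
                if p not in seen:
                    seen.add(p)
                    points.append(p)
                    weights.append(weight)

    a = 1.0 / math.sqrt(2.0)
    b = 1.0 / math.sqrt(3.0)
    l, m = 1.0 / math.sqrt(11.0), 3.0 / math.sqrt(11.0)

    add_family((1.0, 0.0, 0.0), 4.0 / 315.0)      # 6 octahedron vertices
    add_family((a, a, 0.0), 64.0 / 2835.0)        # 12 edge midpoints
    add_family((b, b, b), 27.0 / 1280.0)          # 8 cube vertices
    add_family((l, l, m), 14641.0 / 725760.0)     # 24 remaining points

    return np.array(points), np.array(weights)


points, weights = lebedev_50()

# Convert to azimuth/elevation in radians (convention assumed here:
# azimuth measured in the x-y plane from +x, elevation from the horizon).
azimuth = np.arctan2(points[:, 1], points[:, 0])
elevation = np.arcsin(np.clip(points[:, 2], -1.0, 1.0))

# Sanity check: the spherical mean of x^2 is 1/3.
mean_x2 = float(np.sum(weights * points[:, 0] ** 2))
```

Each of the 50 directions would correspond to one position for which spatial parameters are predicted; how the paper orders or indexes these positions is not stated in this record.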
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Turner, Daniel; Murphy, Damian T. (orcid.org/0000-0002-6676-9459) |
Copyright, Publisher and Additional Information: | © 2024 Daniel Turner et al. |
Institution: | The University of York |
Academic Units: | The University of York > Faculty of Sciences (York) > Electronic Engineering (York) |
Depositing User: | Pure (York) |
Date Deposited: | 11 Jun 2025 15:30 |
Last Modified: | 11 Jun 2025 15:30 |
Status: | Published |
Publisher: | DAFx |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:227755 |
Download
Filename: DAFx24_paper_82.pdf
Description: A deep learning approach to the prediction of time-frequency spatial parameters for use in stereo upmixing
Licence: CC-BY 2.5