Wareing, C. orcid.org/0000-0001-9641-0861, Roy, A.T., Golden, M. et al. (2 more authors) (2025) Data-driven discovery of the equations of turbulent convection. Geophysical and Astrophysical Fluid Dynamics. ISSN 0309-1929
Abstract
We compare the efficiency and ease of use of the Sparse Identification of Nonlinear Dynamics (SINDy) algorithm and the Sparse Physics-Informed Discovery of Empirical Relations (SPIDER) framework in recovering the relevant governing equations and boundary conditions from data generated by direct numerical simulations (DNS) of turbulent convective flows. In the former case, a weak-form implementation, pySINDy, is used. Time-dependent data for two- (2D) and three-dimensional (3D) DNS of Rayleigh-Bénard convection and convective plane Couette flow is generated using the Dedalus PDE framework for spectrally solving differential equations. Using pySINDy we are able to recover the governing equations of 2D models of Rayleigh-Bénard convection at Rayleigh numbers, R, from laminar, through transitional, to moderately turbulent flow conditions, albeit with increasing difficulty at larger Rayleigh number, especially in recovery of the diffusive terms (with coefficient magnitude proportional to √(1/R)). SPIDER requires a much smaller library of terms, and we are able to recover more easily the governing equations for a wider range of R in 2D and 3D convection and plane flow models. We go on to recover constraints (the incompressibility condition) and boundary conditions, demonstrating the benefits and capabilities of SPIDER to go beyond pySINDy for these fluid problems governed by second-order PDEs. At the highest values of R, discrepancies appear between the governing equations that are solved and those that are discovered by SPIDER. We find that this is likely associated with limited resolution of the DNS, demonstrating the potential of machine-learning methods to validate numerical solvers and solutions for such flow problems. We also find that properties of the flow, specifically the correlation time and spatial scales, should inform the initial selection of spatiotemporal subdomain sizes for both pySINDy and SPIDER.
Adopting this default position has the potential to reduce trial and error in selection of data parameters, saving considerable time and effort and allowing the end user of these or similar methods to focus on the importance of setting the power of the integrating polynomial in these weak-form methods and the tolerance of the optimiser technique selected.
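As a rough illustration of why the optimiser tolerance matters in sparse regression of this kind, the following is a minimal sketch of sequentially thresholded least squares on a toy problem. It is not the pySINDy or SPIDER implementation and is not taken from the paper: the function name `stlsq`, the toy equation, the candidate library and the threshold value are all assumptions chosen for illustration.

```python
import numpy as np


def stlsq(theta, dudt, threshold=0.1, max_iter=10):
    """Sequentially thresholded least squares (illustrative sketch).

    Fit all library terms, zero the coefficients whose magnitude is
    below `threshold`, then refit using only the surviving terms.
    """
    xi = np.linalg.lstsq(theta, dudt, rcond=None)[0]
    for _ in range(max_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if not big.any():
            break
        xi[big] = np.linalg.lstsq(theta[:, big], dudt, rcond=None)[0]
    return xi


# Hypothetical toy data: du/dt = -2 u + 0.5 u^3, with weak additive noise.
rng = np.random.default_rng(0)
u = rng.uniform(-2.0, 2.0, size=500)
dudt = -2.0 * u + 0.5 * u**3 + 1e-3 * rng.standard_normal(u.size)

# Candidate library of terms: 1, u, u^2, u^3.
theta = np.column_stack([np.ones_like(u), u, u**2, u**3])
names = ["1", "u", "u^2", "u^3"]

xi = stlsq(theta, dudt, threshold=0.1)
for name, coeff in zip(names, xi):
    print(f"{name:>4s}: {coeff:+.4f}")
```

In this sketch any term whose true coefficient is smaller than the threshold is pruned along with the spurious ones. This is one way to see why terms with small coefficients, such as diffusive terms scaling as √(1/R) at large R, can become harder to recover.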
Metadata
Item Type: | Article |
---|---|
Copyright, Publisher and Additional Information: | © 2025 The Author(s). This is an open access article under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. |
Keywords: | Boundary conditions; Navier Stokes equations; machine learning; data-driven techniques; sparse regression |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Mathematics (Leeds) > Applied Mathematics (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 07 Jul 2025 11:18 |
Last Modified: | 07 Jul 2025 11:18 |
Status: | Published online |
Publisher: | Taylor & Francis |
Identification Number: | 10.1080/03091929.2025.2509469 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:228662 |
Download
Filename: Data-driven discovery of the equations of turbulent convection.pdf
Licence: CC-BY 4.0