Fisher, O.J., Watson, N.J. orcid.org/0000-0001-5216-4873, Porcu, L. et al. (3 more authors) (2022) Data-driven modelling for resource recovery: Data volume, variability, and visualisation for an industrial bioprocess. Biochemical Engineering Journal, 185. 108499. ISSN 1369-703X
Abstract
Advances in industrial digital technologies have led to an increasing volume of data generated from industrial bioprocesses, which can be utilised within data-driven models (DDM). However, complications arising from data volume and variability make it challenging to develop models that capture the underlying biological nature of the bioprocesses. In this study, a framework for developing data-driven models of bioprocesses is proposed and evaluated by modelling an industrial bioprocess, which treats industrial or agrifood wastewaters whilst simultaneously generating bioenergy. Six models were developed to predict the reduction in chemical oxygen demand from the wastewater by the bioprocess and were statistically evaluated using both testing data (randomly partitioned data from the model development) and unseen data (new data not used during the model development). The statistical error metrics employed were the coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE). The stacked neural network model was best able to model the bioprocess, having the highest accuracy on the testing data (R²: 0.98; RMSE: 1.29; MAE: 2.27; MAPE: 4.08) and the unseen data (R²: 0.82; RMSE: 2.57; MAE: 1.75; MAPE: 3.68). Data visualisation is used to observe (or confirm) whether new data points are within the model boundaries, helping to increase confidence in the model's predictions on future data.
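As an illustration of the four error metrics named in the abstract, the following is a minimal Python sketch using their standard textbook definitions. It is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, and the abstract does not state which exact formula variants or units the study used.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and MAPE (in percent) for paired observations.

    Standard textbook definitions; assumed here, not taken from the paper.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when an observed value is zero")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n

    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "mape": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


# Hypothetical observed and predicted values, for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.0, 4.0, 6.0, 10.0]
metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, calling the function once on the testing partition and once on the unseen data would give the two sets of figures that the abstract compares.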
Metadata
Item Type: | Article |
---|---|
Copyright, Publisher and Additional Information: | © 2022 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
Keywords: | Data-driven models; Bioprocess; Data volume; Data variability; Data visualisation |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Environment (Leeds) > School of Food Science and Nutrition (Leeds) > FSN Nutrition and Public Health (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 10 Jul 2024 13:59 |
Last Modified: | 10 Jul 2024 13:59 |
Status: | Published |
Publisher: | Elsevier |
Identification Number: | 10.1016/j.bej.2022.108499 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:214617 |