Predicting crop root concentration factors of organic contaminants with machine learning models

Abstract

Accurate prediction of uptake and accumulation of organic contaminants by crops from soils is essential to assessing human exposure via the food chain. However, traditional empirical or mechanistic models frequently show variable performance due to complex interactions among contaminants, soils, and plants. In this study, different machine learning algorithms were therefore compared and applied to predict root concentration factors (RCFs) from a dataset comprising 57 chemicals and 11 crops, with a traditional linear regression model as the benchmark. The RCF patterns and predictions were investigated with unsupervised t-distributed stochastic neighbor embedding and four supervised machine learning models (Random Forest, Gradient Boosting Regression Tree, Fully Connected Neural Network, and Support Vector Regression) based on 15 property descriptors. The Fully Connected Neural Network demonstrated superior prediction performance for RCFs (R² = 0.79, mean absolute error [MAE] = 0.22) over the other machine learning models (R² = 0.68–0.76, MAE = 0.23–0.26). All four machine learning models performed better than the traditional linear regression model (R² = 0.62, MAE = 0.29). Four key property descriptors were identified in predicting RCFs. Specifically, increasing root lipid content and decreasing soil organic matter content increased RCFs, while increasing excess molar refractivity and molecular volume of contaminants decreased RCFs. These results show that machine learning models can improve prediction accuracy by learning nonlinear relationships between RCFs and properties of contaminants, soils, and plants.
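
The abstract ranks the models by R² and MAE. As a minimal sketch of how these two scores are conventionally computed and compared, assuming the standard definitions (the record does not state the exact formulas or data transformation used in the paper), the following Python uses only the standard library. The values and the names `observed`, `model_a`, and `model_b` are hypothetical and are not data from the study.

```python
from statistics import mean


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(o - p) for o, p in zip(observed, predicted))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only; NOT data from the study.
observed = [0.10, 0.45, 0.80, 1.20, 1.55]
model_a = [0.20, 0.40, 0.90, 1.10, 1.50]  # predictions close to observations
model_b = [0.40, 0.30, 1.10, 0.90, 1.30]  # predictions further from observations

for name, predicted in [("model_a", model_a), ("model_b", model_b)]:
    r2 = r_squared(observed, predicted)
    mae = mean_absolute_error(observed, predicted)
    print(f"{name}: R2 = {r2:.2f}, MAE = {mae:.2f}")
```

A higher R² together with a lower MAE indicates predictions closer to the observations, which is the sense in which the abstract reports one model outperforming another.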

Metadata

Item Type:	Article
Authors/Creators:	Gao, Feng; Shen, Yike; Brett Sallach, J. (https://orcid.org/0000-0003-4588-3364); Li, Hui; Zhang, Wei (wz510@york.ac.uk); Li, Yuanbo; Liu, Cun
Copyright, Publisher and Additional Information:	Funding Information: The work was supported by the National Key Research and Development Program of China, China (2019YFC1604503 and 2016YFD0800403). © 2021 Elsevier B.V. This is an author-produced version of the published paper. Uploaded in accordance with the publisher’s self-archiving policy.
Keywords:	Machine learning, Model interpretability, Organic contaminant, Plant uptake, Root concentration factor
Dates:	Accepted: 3 October 2021; Published (online): 5 October 2021; Published: 15 February 2022
Institution:	The University of York
Academic Units:	The University of York > Faculty of Sciences (York) > Environment and Geography (York); The University of York > Faculty of Sciences (York) > Chemistry (York)
Depositing User:	Pure (York)
Date Deposited:	25 Nov 2021 14:00
Last Modified:	17 Sep 2025 02:46
Published Version:	https://doi.org/10.1016/j.jhazmat.2021.127437
Status:	Published
Refereed:	Yes
Identification Number:	10.1016/j.jhazmat.2021.127437
Related URLs:	http://www.scopus.com/inward/record.url?...
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:180871

Download

Accepted Version

Filename: Gao_JHM_SI_R1_Final_clean.docx

Description: Gao_JHM_SI_R1_Final_clean

Licence: CC-BY-NC-ND 2.5
