Improving the prediction of an atmospheric chemistry transport model using gradient boosted regression trees

Abstract

Predictions from process-based models of environmental systems are biased, due to uncertainties in their inputs and parameterisations, reducing their utility. We develop a predictor for the bias in tropospheric ozone (a key pollutant) calculated by an atmospheric chemistry transport model (GEOS-Chem), based on outputs from the model and observations of ozone from both the surface (EPA, EMEP and GAW) and the ozone-sonde networks. We train a gradient-boosted decision tree algorithm (XGBoost) to predict model bias, with model and observational data for 2010–2015, and then test the approach using the years 2016–2017. We show that the bias-corrected model performs significantly better than the uncorrected model. The root mean square error (RMSE) is reduced from 16.21 ppb to 7.48 ppb, the normalised mean bias (NMB) is reduced from 0.28 to −0.04, and the Pearson's R is increased from 0.479 to 0.841. Comparisons with observations from the NASA ATom flights (which were not included in the training) also show improvements, but to a smaller extent: the RMSE is reduced from 12.11 ppb to 10.50 ppb, the NMB is reduced from 0.08 to 0.06, and the Pearson's R is increased from 0.761 to 0.792. We attribute the smaller improvements to the lack of routine observational constraints of the remote troposphere. We explore the choice of predictor (bias prediction versus direct prediction) and conclude both may have utility. We show that the method is robust to variations in the volume of training data, with approximately a year of data needed to produce useful performance. Data denial experiments (removing observational sites from the algorithm training) show that information from one location (for example Europe) can reduce the model bias over other locations (for example North America), which might provide insights into the processes controlling the model bias.
We conclude that combining machine learning approaches with process-based models may provide a useful tool for improving the performance of air quality forecasts or for providing enhanced assessments of the impact of pollutants on human and ecosystem health, and may have utility in other environmental applications.
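
The evaluation statistics quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes that bias is defined as model minus observation, that the correction subtracts the predicted bias from the model value, and that the normalised mean bias is the sum of (model − observation) divided by the sum of observations; the abstract states none of these definitions explicitly. The function names and the ozone values (in ppb) are made up for illustration, and the `predicted_bias` list stands in for the output of a trained regressor such as XGBoost.

```python
import math

def rmse(model, obs):
    """Root mean square error between paired model and observed values."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))

def nmb(model, obs):
    """Normalised mean bias, assumed here to be sum(model - obs) / sum(obs)."""
    return sum(m - o for m, o in zip(model, obs)) / sum(obs)

def pearson_r(model, obs):
    """Pearson correlation coefficient between model and observed values."""
    n = len(obs)
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_m * var_o)

def apply_bias_correction(model, predicted_bias):
    """Subtract the predicted bias (assumed model minus observation) from the model."""
    return [m - b for m, b in zip(model, predicted_bias)]

# Hypothetical ozone values in ppb; not data from the paper.
obs = [30.0, 40.0, 50.0, 60.0]
model = [40.0, 48.0, 62.0, 70.0]
# Stand-in for the output of a trained bias predictor (hypothetical values).
predicted_bias = [9.0, 9.0, 11.0, 11.0]

corrected = apply_bias_correction(model, predicted_bias)

for label, values in (("uncorrected", model), ("corrected", corrected)):
    print(label, rmse(values, obs), nmb(values, obs), pearson_r(values, obs))
```

In this toy case the correction removes most of the systematic offset, so the RMSE and NMB both fall; real performance depends on how well the trained predictor generalises to data it has not seen.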

Metadata

Item Type:	Article
Authors/Creators:	Ivatt, Peter (pi517@york.ac.uk); Evans, Mathew John (https://orcid.org/0000-0003-4775-032X)
Copyright, Publisher and Additional Information:	© Author(s) 2020.
Dates:	Accepted: 9 June 2020; Published: 13 July 2020
Institution:	The University of York
Academic Units:	The University of York > Faculty of Sciences (York) > Chemistry (York)
Depositing User:	Pure (York)
Date Deposited:	03 Oct 2019 10:40
Last Modified:	20 Sep 2025 01:03
Published Version:	https://doi.org/10.5194/acp-2019-753
Status:	Published
Refereed:	Yes
Identification Number:	10.5194/acp-2019-753
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:151716

Download

Published Version

Filename: acp_20_8063_2020.pdf

Description: acp-20-8063-2020

Licence: CC-BY 2.5
