Toraskar, S., Khan, A., Niranjannaik, M. et al. (2 more authors) (2025) AutoML-Fire: Automated machine-learning approach to predict forest fires. Environmental Modelling & Software, 193. 106578. ISSN: 1364-8152
Abstract
Forest fires pose a serious threat to the environment. Their frequency and intensity have increased in recent decades due to climate change and heightened anthropogenic interference. About 21.7% of India’s total land is covered by forests, which play a crucial role in biodiversity, ecology, and the livelihoods dependent on these ecosystems. However, 36% of these forested regions are prone to frequent and devastating fires. Given this vulnerability, predicting forest fires is essential for minimising damage. In this first pan-India study, we predict forest fire occurrence in the most vulnerable regions across India using a dataset spanning 2003 to 2018, incorporating variables such as cloud cover, elevation, forest cover fraction, humidity, NDVI, population, soil moisture, temperature, wind speed, and precipitation. We partitioned the data into four clusters based on spatial proximity to capture regional patterns. SHAP (SHapley Additive exPlanations) values were used to enhance model interpretability and provide insights into socio-technical complexities unique to different regions. We analysed Partial Dependence Plots (PDP) to capture the trend of forest fires with individual features. The challenge of data imbalance, often encountered in natural hazard prediction, was addressed using the Synthetic Minority Over-Sampling Technique for Regression with Gaussian Noise (SMOGN) algorithm, which balances regression data. Selecting appropriate machine-learning models and tuning their hyperparameters is a complex process that requires domain expertise. To address this, we propose an automated machine-learning (AutoML) framework that uses Bayesian optimisation to return a best-performing, finely tuned model. The “AutoML-FIRE” model exhibited robust performance, with R values between 0.73 and 0.85 and RMSE values ranging from 3.40 to 6.09, outperforming all considered benchmarking algorithms. Furthermore, uncertainty analysis and spatial distribution analysis were conducted to validate the model’s stability. Our analysis demonstrates that the AutoML-FIRE model is robust, exhibits broad applicability for national-scale fire risk assessment, and enables notifications to authorities and local communities regarding impending fire events.
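The abstract reports model skill as R and RMSE values. As a minimal sketch of how such scores could be computed for one regional cluster, the snippet below assumes that R denotes the Pearson correlation coefficient between observed and predicted values (the abstract does not define it) and uses the standard root-mean-square error. The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(observed, predicted):
    """Pearson correlation coefficient (assumed meaning of 'R')."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    covariance = sum((o - mean_o) * (p - mean_p)
                     for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_o * var_p)


# Hypothetical observed and predicted fire counts for one cluster.
observed = [0.0, 2.0, 5.0, 9.0, 14.0]
predicted = [1.0, 1.0, 6.0, 8.0, 12.0]

print(f"R    = {pearson_r(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
```

RMSE is expressed in the units of the target variable, so values such as those quoted in the abstract are only comparable across clusters if the target is scaled the same way in each.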
Metadata
Item Type: | Article |
---|---|
Keywords: | Forest fire, Automated machine learning, Bayesian optimisation, SHAP, SMOGN |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Mathematics (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 22 Sep 2025 11:43 |
Last Modified: | 22 Sep 2025 11:43 |
Status: | Published |
Publisher: | Elsevier |
Identification Number: | 10.1016/j.envsoft.2025.106578 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:231966 |