Shapley value-based approaches to explain the quality of predictions by classifiers

Abstract

The use of algorithm-agnostic approaches for explainable machine learning is an emerging area of research. Work on explaining the contribution of features towards the predicted outcome has traditionally focused on the prediction itself; little has been done to explain the quality of a model's predictions, where quality can be assessed by how the algorithm performs as the classification threshold changes. In this paper, we propose the use of Shapley values to explain the contribution of features towards the overall algorithm performance, measured in terms of the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC). With the help of an illustrative example, we demonstrate the proposed idea of explaining the ROC curve and visualising the uncertainties in these curves. For imbalanced datasets, the precision-recall curve (PRC) is considered more appropriate, so we also demonstrate how to explain PRCs with the help of Shapley values. Explaining model performance can help analysts in a number of ways, for example in feature selection, by identifying irrelevant features that can be removed to reduce computational complexity. It can also help identify the features that contribute critically to the overall algorithm performance.
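To make the idea of attributing AUC to features concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a hypothetical value function, `coalition_auc`, in which a coalition of features is scored by simply summing those features and the empty coalition is assigned the chance-level AUC of 0.5; the paper's actual value function is not stated in the abstract. The toy data and all function names are illustrative.

```python
from itertools import combinations
from math import factorial


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def coalition_auc(X, y, coalition):
    """Hypothetical value function: AUC of a toy scorer that sums the coalition's features."""
    if not coalition:
        return 0.5  # assumption: with no features the score is constant, so AUC is chance level
    scores = [sum(row[j] for j in coalition) for row in X]
    return auc(scores, y)


def shapley_auc(X, y):
    """Exact Shapley values of each feature towards the AUC (enumerates all coalitions)."""
    m = len(X[0])
    phi = [0.0] * m
    for i in range(m):
        others = [j for j in range(m) if j != i]
        for k in range(m):
            # Shapley weight for coalitions of size k that exclude feature i
            weight = factorial(k) * factorial(m - k - 1) / factorial(m)
            for S in combinations(others, k):
                gain = coalition_auc(X, y, S + (i,)) - coalition_auc(X, y, S)
                phi[i] += weight * gain
    return phi


# Illustrative toy data (integer-valued so that ties are exact); not from the paper.
X = [
    [9, 2, 5],
    [8, 7, 1],
    [7, 1, 9],
    [3, 6, 4],
    [2, 3, 8],
    [1, 9, 2],
]
y = [1, 1, 1, 0, 0, 0]

phi = shapley_auc(X, y)
full_auc = coalition_auc(X, y, (0, 1, 2))
print("Shapley contributions to AUC:", phi)
print("AUC with all features:", full_auc)
```

By the efficiency property of Shapley values, the contributions sum to the AUC with all features minus the baseline of 0.5, so a feature with a near-zero or negative contribution is a candidate for removal under this toy value function. Exact enumeration grows exponentially with the number of features, so it is practical only for small feature sets.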

Metadata

Item Type:	Article
Authors/Creators:	Pelegrina, G.D.; Siraj, S. (ORCID: https://orcid.org/0000-0002-7962-9930)
Copyright, Publisher and Additional Information:	© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Keywords:	explainable artificial intelligence, machine learning, business analytics
Dates:	Published: August 2024; Published (online): 21 February 2024; Accepted: 5 February 2024
Institution:	The University of Leeds
Academic Units:	The University of Leeds > Faculty of Business (Leeds) > Management Division (LUBS) (Leeds) > Management Division Decision Research (LUBS)
Depositing User:	Symplectic Publications
Date Deposited:	23 Apr 2024 10:22
Last Modified:	08 Oct 2024 01:53
Status:	Published
Publisher:	Institute of Electrical and Electronics Engineers
Identification Number:	10.1109/tai.2024.3365082
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:211725
