Causal Explanations from the Geometric Properties of ReLU Neural Networks

Abstract

Neural networks have proved an effective means of learning control policies for autonomous systems, such as Maritime Autonomous Surface Ships (MASS), but these learned policies are difficult to understand due to the black-box nature of neural networks. This lack of interpretability makes safety assurance for such autonomous systems challenging. The fields of eXplainable Artificial Intelligence (XAI) and eXplainable Reinforcement Learning (XRL) aim to interpret the decision-making processes of neural networks and autonomous agents respectively. In particular, work on causal explanations aims to provide "why" and "why not" explanations for a model's decisions. However, most work on explainability to date relies on a distilled version of the original model. While this distilled policy is interpretable, it necessarily degrades in performance compared to the original model and is not guaranteed to reflect the original model's decision-making processes accurately, so it cannot be used to guarantee the original model's safety. Recent work on the geometry of ReLU neural networks shows that a ReLU network corresponds to a piecewise linear function whose linear regions are each an n-dimensional convex polytope. Through this lens, a neural network can be understood as dividing the input space into distinct regions, within each of which a single linear function is applied for each output neuron. We show that this geometric representation can be used to generate causal explanations for the network's behaviour similar to those of previous work, but with rules extracted directly from the geometry of neural networks with the ReLU activation function, so that the explanations are an accurate reflection of the network's behaviour.
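
The geometric view described in the abstract can be illustrated with a minimal sketch. It is not the authors' method or code: the weights W1, B1, W2, B2 and all function names are hypothetical, chosen only to show that, for a one-hidden-layer ReLU network, the on/off pattern of the hidden neurons at an input fixes both a convex polytope around that input and a single affine function that the network applies everywhere inside it.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]

# Hypothetical weights (assumption, not from the paper):
# 2 inputs -> 3 hidden ReLU neurons -> 2 outputs.
W1: Matrix = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.0]]
B1: Vector = [0.0, -1.0, 0.5]
W2: Matrix = [[1.0, 2.0, -1.0], [0.0, -1.0, 3.0]]
B2: Vector = [0.1, -0.2]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def pre_activations(x: Vector) -> Vector:
    """Hidden-layer values before the ReLU is applied."""
    return [p + b for p, b in zip(matvec(W1, x), B1)]


def forward(x: Vector) -> Vector:
    """Ordinary forward pass through the network."""
    hidden = [max(0.0, p) for p in pre_activations(x)]
    return [o + b for o, b in zip(matvec(W2, hidden), B2)]


def activation_pattern(x: Vector) -> List[bool]:
    """Which hidden neurons are active (pre-activation > 0) at x."""
    return [p > 0.0 for p in pre_activations(x)]


def local_affine_map(x: Vector) -> Tuple[Matrix, Vector]:
    """Return (A, c) with forward(y) == A y + c for all y in x's region."""
    pattern = activation_pattern(x)
    # Inactive neurons output 0, so zero their weights and biases.
    masked_w1 = [[w if on else 0.0 for w in row] for row, on in zip(W1, pattern)]
    masked_b1 = [b if on else 0.0 for b, on in zip(B1, pattern)]
    cols = list(zip(*masked_w1))
    a = [[sum(u * v for u, v in zip(row, col)) for col in cols] for row in W2]
    c = [v + b for v, b in zip(matvec(W2, masked_b1), B2)]
    return a, c


def region_constraints(x: Vector) -> List[Tuple[Vector, float]]:
    """Half-spaces (n, d), each meaning n.y + d >= 0, bounding x's region."""
    constraints = []
    for row, b, on in zip(W1, B1, activation_pattern(x)):
        sign = 1.0 if on else -1.0
        constraints.append(([sign * w for w in row], sign * b))
    return constraints


if __name__ == "__main__":
    x = [1.0, 0.5]
    a, c = local_affine_map(x)
    print("pattern:", activation_pattern(x))
    print("network output:", forward(x))
    print("affine output: ", [v + b for v, b in zip(matvec(a, x), c)])
    print("region half-spaces:", region_constraints(x))
```

In this sketch, the half-spaces returned by region_constraints play the role of the "rules" for a region: every input that satisfies all of them receives the same affine map, so an explanation read off from them describes the network itself rather than a distilled approximation. Deeper networks would need the same masking applied layer by layer, which this sketch does not attempt.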

Metadata

Item Type:	Conference or Workshop Item
Authors/Creators:	Woods, Hector; Ryan, Philippa Mary (https://orcid.org/0000-0003-1307-5207); Alexander, Rob (https://orcid.org/0000-0003-3818-0310)
Dates:	Published: 2025
Institution:	The University of York
Academic Units:	The University of York > Faculty of Sciences (York) > Computer Science (York)
Date Deposited:	08 Oct 2025 11:50
Last Modified:	12 Dec 2025 16:54
Status:	Published
Refereed:	Yes
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:232693

Download

Accepted Version

Filename: YISEC_Paper.pdf

Description: YISEC Paper Hector Woods

Licence: Creative Commons: Public Domain Dedication

