Walter, M., Webb, S.J. and Gillet, V.J. orcid.org/0000-0002-8403-3111 (2024) Interpreting neural network models for toxicity prediction by extracting learned chemical features. Journal of Chemical Information and Modeling, 64 (9). pp. 3670-3688. ISSN 1549-9596
Abstract
Neural network models have become a popular machine learning technique for predicting the toxicity of chemicals. However, their complex structure makes it difficult to understand the predictions they produce, which limits confidence in them. Current techniques for tackling this problem, such as SHAP or integrated gradients, provide insights by attributing importance to the input features of individual compounds. While these methods have produced promising results in some cases, they do not shed light on how representations of compounds are transformed in the hidden layers, which is how neural networks learn. We present a novel technique for interpreting neural networks that identifies the chemical substructures in the training data that are responsible for the activation of hidden neurons. For an individual test compound, the importance of each hidden neuron is determined, and the associated substructures are used to explain the model prediction. Using structural alerts for mutagenicity from the Derek Nexus expert system as a ground truth, we demonstrate the validity of the approach and show that, in some cases, the model explanations are complementary to explanations obtained from an established feature attribution method.
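To make the idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hand-set one-hidden-layer ReLU network over four hypothetical substructure-presence bits; the feature names, weights, enrichment score, and the "activation times output weight" importance measure are all illustrative assumptions that do not come from the paper.

```python
import numpy as np

# Hypothetical toy data: rows are compounds, columns are substructure-presence bits.
FEATURES = ["nitro", "aromatic_amine", "epoxide", "carboxylic_acid"]
X_train = np.array([
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 1],
], dtype=float)

# Hand-set weights of a one-hidden-layer ReLU network (assumed, not trained).
W1 = np.array([
    [1.0, 0.0],    # nitro
    [0.0, 1.0],    # aromatic_amine
    [0.0, 1.0],    # epoxide
    [-0.5, -0.5],  # carboxylic_acid
])
b1 = np.zeros(2)
w2 = np.array([2.0, 1.0])  # hidden-to-output weights


def hidden(X):
    """ReLU activations of the hidden layer for a batch of compounds."""
    return np.maximum(X @ W1 + b1, 0.0)


def neuron_substructures(X, threshold=0.0):
    """For each hidden neuron, list features enriched among the compounds that activate it."""
    H = hidden(X)
    result = {}
    for j in range(H.shape[1]):
        active = H[:, j] > threshold
        if not active.any():
            result[j] = []
            continue
        # Feature frequency among activating compounds minus frequency overall.
        enrichment = X[active].mean(axis=0) - X.mean(axis=0)
        order = np.argsort(-enrichment)
        result[j] = [FEATURES[i] for i in order if enrichment[i] > 0]
    return result


def neuron_importance(x):
    """Assumed importance measure: hidden activation times output weight."""
    return hidden(x[None, :])[0] * w2


def explain(x, X):
    """Return the most important hidden neuron for x and its associated substructures."""
    top = int(np.argmax(neuron_importance(x)))
    return top, neuron_substructures(X)[top]


top, substructures = explain(np.array([1.0, 0.0, 0.0, 0.0]), X_train)
print(top, substructures)
```

For the nitro-only test compound, the sketch selects hidden neuron 0 and reports `nitro` as its associated substructure. A real application would replace the toy bits with substructures extracted from training compounds and the hand-set weights with a trained model.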
Metadata
| Field | Value |
|---|---|
| Item Type | Article |
| Authors/Creators | Walter, M.; Webb, S.J.; Gillet, V.J. |
| Copyright, Publisher and Additional Information | © 2024 The Authors. Published by American Chemical Society. This publication is licensed under CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/). |
| Keywords | Aromatic compounds; Chemical structure; Layers; Mathematical methods; Toxicity |
| Institution | The University of Sheffield |
| Academic Units | The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield) |
| Funding Information | Funder: LHASA LIMITED; Grant number: UNSPECIFIED |
| Depositing User | Symplectic Sheffield |
| Date Deposited | 29 Apr 2024 14:38 |
| Last Modified | 20 May 2024 08:54 |
| Status | Published |
| Publisher | American Chemical Society |
| Refereed | Yes |
| Identification Number | 10.1021/acs.jcim.4c00127 |
| Open Archives Initiative ID (OAI ID) | oai:eprints.whiterose.ac.uk:211994 |