Can you trust your Agent? The Effect of Out-of-Distribution Detection on the Safety of Reinforcement Learning Systems

Haider, Tom, Roscher, Karsten, Herd, Benjamin et al. (2 more authors) (2024) Can you trust your Agent? The Effect of Out-of-Distribution Detection on the Safety of Reinforcement Learning Systems. In: 39th Annual ACM Symposium on Applied Computing, SAC 2024, 08-12 Apr 2024. Proceedings of the ACM Symposium on Applied Computing. Association for Computing Machinery, Inc, ESP, pp. 1569-1578.

Abstract

Deep Reinforcement Learning (RL) has the potential to revolutionize the automation of complex sequential decision-making problems. Although it has been successfully applied to a wide range of tasks, deployment to real-world settings remains challenging and is often limited. One of the main reasons for this is the lack of safety guarantees for conventional RL algorithms, especially in situations that substantially differ from the learning environment. In such situations, state-of-the-art systems will fail silently, producing action sequences without signaling any uncertainty regarding the current input. Recent works have suggested Out-of-Distribution (OOD) detection as an additional reliability measure when deploying RL in the real world. How these mechanisms benefit the safety of the entire system, however, is not yet fully understood. In this work, we study how OOD detection contributes to the safety of RL systems by describing the challenges involved in detecting unknown situations. We derive several definitions for unknown events and explore potential avenues for a successful safety argumentation, building on recent work for safety assurance of Machine Learning components. In a series of experiments, we compare different OOD detectors and show how difficult it is to distinguish harmless from potentially unsafe OOD events in practice, and how standard evaluation schemes can lead to deceptive conclusions, depending on which definition of unknown is applied.

Metadata

Item Type:	Proceedings Paper
Authors/Creators:	Haider, Tom; Roscher, Karsten; Herd, Benjamin; Schmoeller Roza, Felippe; Burton, Simon https://orcid.org/0000-0001-9040-8752
Copyright, Publisher and Additional Information:	Publisher Copyright: © 2024 Copyright held by the owner/author(s).
Keywords:	AI safety, anomaly detection, OOD detection, reinforcement learning, sequential decision making
Dates:	Published: 8 April 2024
Institution:	The University of York
Academic Units:	The University of York > Faculty of Sciences (York) > Computer Science (York)
Date Deposited:	06 Aug 2025 09:40
Last Modified:	11 Feb 2026 00:07
Published Version:	https://doi.org/10.1145/3605098.3635931
Status:	Published
Publisher:	Association for Computing Machinery, Inc
Series Name:	Proceedings of the ACM Symposium on Applied Computing
Identification Number:	10.1145/3605098.3635931
Related URLs:	http://www.scopus.com/inward/record.url?...
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:230127

Download

Published Version

Filename: 3605098.3635931.pdf

Description: Can you trust your Agent? The Effect of Out-of-Distribution Detection on the Safety of Reinforcement Learning Systems

Licence: CC-BY 2.5
