Fearnley, Laura Christina Anne, Cairns, Elly, Stoneham, Tom orcid.org/0000-0001-5490-4927 et al. (7 more authors) (2025) Risk of What? Defining Harm in the Context of AI Safety. [Preprint]
Abstract
For decades, the field of system safety has designed safe systems by reducing the risk of physical harm to humans, property and the environment to an acceptable level. Recently, this definition of safety has come under scrutiny from governments and researchers who argue that the narrow focus on reducing physical harm, whilst necessary, is not sufficient to secure the safety of AI systems. There is growing pressure to expand the scope of safety in the context of AI to address emerging harms, with particular emphasis placed on the ways AI systems can reinforce and reproduce systemic harms. In this paper, we advocate for expanding the scope of conventional safety to include non-physical harms in the context of AI. However, we caution against broadening the scope to address systemic harms, as doing so presents intractable practical challenges for current safety methodologies. Instead, we propose that the scope of safety-related harms should be expanded to include psychological harms. Our proposal is partly motivated by the debates about, and evidence on, social media, which fundamentally reshaped how harm is understood and addressed in the digital age and prompted new regulatory frameworks aimed at protecting users from the psychological risks of the technology. We draw on this precedent to motivate the inclusion of psychological harms in AI safety assessments. By expanding the scope of AI safety to include psychological harms, we take a critical step toward evolving the discipline of system safety into one that is better tuned and equipped to protect users against the complex and emerging harms propagated by AI systems.