Evaluating Metrics for Safety with LLM-as-Judges

This is a preprint and may not have undergone formal peer review

Clegg, Kester orcid.org/0000-0002-4484-3291, Hawkins, Richard orcid.org/0000-0001-7347-3413, Habli, Ibrahim orcid.org/0000-0003-2736-8238 et al. (1 more author) (2025) Evaluating Metrics for Safety with LLM-as-Judges. [Preprint]

Metadata

Item Type: Preprint
Authors/Creators: Clegg, Kester; Hawkins, Richard; Habli, Ibrahim; et al. (1 more author)
Keywords: cs.CL,cs.AI
Dates:
  • Published: 17 December 2025
Institution: The University of York
Academic Units: The University of York > Faculty of Sciences (York) > Computer Science (York)
Date Deposited: 11 Mar 2026 17:00
Last Modified: 06 May 2026 05:28
Published Version: https://doi.org/10.48550/arXiv.2512.15617
Status: Published
Publisher: arXiv
Identification Number: 10.48550/arXiv.2512.15617

Download

Filename: 2512.15617v1.pdf

Description: 2512.15617v1
