LexGLUE: a benchmark dataset for legal language understanding in English

Abstract

Law, interpretations of law, legal arguments, agreements, etc. are typically expressed in writing, leading to the production of vast corpora of legal text. Their analysis, which is at the center of legal practice, becomes increasingly elaborate as these collections grow in size. Natural language understanding (NLU) technologies can be a valuable tool to support legal practitioners in these endeavors. Their usefulness, however, largely depends on whether current state-of-the-art models can generalize across various tasks in the legal domain. To answer this currently open question, we introduce the Legal General Language Understanding Evaluation (LexGLUE) benchmark, a collection of datasets for evaluating model performance across a diverse set of legal NLU tasks in a standardized way. We also provide an evaluation and analysis of several generic and legal-oriented models demonstrating that the latter consistently offer performance improvements across multiple tasks.

Metadata

Item Type:	Article
Authors/Creators:	Chalkidis, I.; Jana, A.; Hartung, D.; Bommarito, M.; Androutsopoulos, I.; Katz, D.M.; Aletras, N. (https://orcid.org/0000-0003-4285-1965)
Copyright, Publisher and Additional Information:	© 2021 The Authors. Preprint available under a Creative Commons Attribution Licence (http://creativecommons.org/licenses/by/4.0).
Dates:	Submitted: 3 October 2021
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield)
Depositing User:	Symplectic Sheffield
Date Deposited:	06 Oct 2021 10:11
Last Modified:	06 Oct 2021 10:11
Published Version:	https://arxiv.org/abs/2110.00976v1
Status:	Submitted
Related URLs:	arXiv URL
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:178919

Download

Submitted Version

Filename: 2110.00976v1.pdf

Licence: CC-BY 4.0

