HERB: Measuring hierarchical regional bias in pre-trained language models

This is the latest version of this eprint.

Li, Y., Zhang, G., Yang, B. et al. (4 more authors) (2022) HERB: Measuring hierarchical regional bias in pre-trained language models. In: He, Y., Ji, H., Liu, Y., Li, S. and Chang, C.-H., (eds.) Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022. The 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, 20-23 Nov 2022, Online. Association for Computational Linguistics, pp. 334-346. ISBN 9781959429043

Abstract

Content Warning: This work contains examples that potentially implicate stereotypes, associations, and other harms that could be offensive to individuals in certain regions.

Fairness has become a trending topic in natural language processing (NLP) and covers biases targeting certain social groups such as genders and religions. Yet regional bias, another long-standing global discrimination problem, remains unexplored. Consequently, we provide a study analysing the regional bias learned by the pre-trained language models (LMs) that are broadly used in NLP tasks. While verifying the existence of regional bias in LMs, we find that the biases on regional groups can be largely affected by the corresponding geographical clustering. We accordingly propose a hierarchical regional bias evaluation method (HERB) utilising the information from the sub-region clusters to quantify the bias in the pre-trained LMs. Experiments show that our hierarchical metric can effectively evaluate the regional bias with regard to comprehensive topics and measure the potential regional bias that can be propagated to downstream tasks. Our code is available at https://github.com/Bernard-Yang/HERB.

Metadata

Item Type:	Proceedings Paper
Authors/Creators:	Li, Y.; Zhang, G.; Yang, B.; Lin, C.; Ragni, A. (https://orcid.org/0000-0003-0634-4456); Wang, S.; Fu, J.
Editors:	He, Y.; Ji, H.; Liu, Y.; Li, S.; Chang, C.-H.
Copyright, Publisher and Additional Information:	© 2022 The Association for Computational Linguistics. Licensed under a Creative Commons Attribution 4.0 International License. (https://creativecommons.org/licenses/by/4.0/)
Dates:	Published: November 2022
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield)
Depositing User:	Symplectic Sheffield
Date Deposited:	05 Jun 2024 15:48
Last Modified:	05 Jun 2024 15:48
Published Version:	https://aclanthology.org/2022.findings-aacl.32
Status:	Published
Publisher:	Association for Computational Linguistics
Refereed:	Yes
Related URLs:	Software or Code
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:213163

Available Versions of this Item

HERB: Measuring hierarchical regional bias in pre-trained language models. (deposited 05 Jun 2024 15:56)
- HERB: Measuring hierarchical regional bias in pre-trained language models. (deposited 05 Jun 2024 15:48) [Currently Displayed]

Download

Published Version

Filename: 2022.findings-aacl.32.pdf

Licence: CC-BY 4.0

