Zhao, Z., Zhang, Z. and Hopfgartner, F. orcid.org/0000-0003-0380-6088 (2021) A comparative study of using pre-trained language models for toxic comment classification. In: Leskovec, J., Grobelnik, M., Najork, M., Tang, J. and Zia, L. (eds.) Companion Proceedings of the Web Conference 2021 (WWW ’21 Companion). SocialNLP 2021: The 9th International Workshop on Natural Language Processing for Social Media, 19 Apr 2021, Virtual conference. ACM Digital Library, pp. 500-507. ISBN 9781450383134
Abstract
As user-generated content thrives, so does the spread of toxic comments. Detecting toxic comments has therefore become an active research area, and it is often framed as a text classification task. Pre-trained language model-based methods are at the forefront of natural language processing, achieving state-of-the-art performance on a wide range of NLP tasks, yet there is a paucity of studies applying them to toxic comment classification. In this work, we study how to best make use of pre-trained language model-based methods for toxic comment classification and how different pre-trained language models perform on these tasks. Our results show that, of the three most popular language models, i.e. BERT, RoBERTa, and XLM, BERT and RoBERTa generally outperform XLM on toxic comment classification. We also show that a basic linear downstream structure outperforms complex ones such as CNN and BiLSTM. Furthermore, we find that further fine-tuning a pre-trained language model with light hyper-parameter settings improves the downstream toxic comment classification task, especially when the task has a relatively small dataset.
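This record does not include the authors' code. As a rough illustration of the kind of setup the abstract describes (a pre-trained encoder such as BERT with a basic linear classification head, fine-tuned on a toxic comment dataset with light hyper-parameters), a minimal sketch using the Hugging Face transformers library might look like the following. The model checkpoint, toy data, and hyper-parameter values are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch (not the authors' code): fine-tune a pre-trained encoder
# with a plain linear classification head for binary toxic-comment labels.
# Checkpoint name, data, and hyper-parameters are illustrative assumptions.
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-uncased"  # could equally be "roberta-base" or an XLM checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
# AutoModelForSequenceClassification puts a single linear layer on top of the
# encoder output -- the "basic linear downstream structure" in the abstract.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Hypothetical toy data: (comment text, toxic label) pairs.
train_pairs = [("thanks for the helpful edit", 0), ("some toxic insult here", 1)]

def collate(batch):
    texts, labels = zip(*batch)
    enc = tokenizer(list(texts), padding=True, truncation=True,
                    max_length=128, return_tensors="pt")
    enc["labels"] = torch.tensor(labels)
    return enc

loader = DataLoader(train_pairs, batch_size=16, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # "light" fine-tuning settings

model.train()
for epoch in range(3):
    for batch in loader:
        optimizer.zero_grad()
        out = model(**batch)   # cross-entropy loss computed internally from the labels
        out.loss.backward()
        optimizer.step()
```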
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Zhao, Z.; Zhang, Z.; Hopfgartner, F. (orcid.org/0000-0003-0380-6088) |
Editors: | Leskovec, J.; Grobelnik, M.; Najork, M.; Tang, J.; Zia, L. |
Copyright, Publisher and Additional Information: | © 2021 IW3C2 (International World Wide Web Conference Committee), published under the Creative Commons CC-BY 4.0 License (http://creativecommons.org/licenses/by/4.0). |
Keywords: | toxic comment; hate speech; neural networks; language model; fine-tuning; pre-training; BERT; RoBERTa; XLM |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 20 May 2021 07:15 |
Last Modified: | 09 Jun 2021 12:59 |
Status: | Published |
Publisher: | ACM Digital Library |
Refereed: | Yes |
Identification Number (DOI): | 10.1145/3442442.3452313 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:173371 |