Jiang, W., Miao, Q., Lin, C. et al. (4 more authors) (2023) DisCo: Distilled Student Models Co-training for Semi-supervised Text Mining. In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 06-10 Dec 2023, Singapore. ACL, Stroudsburg, PA, pp. 4015-4030. ISBN 979-8-89176-060-8
Abstract
Many text mining models are constructed by fine-tuning a large deep pre-trained language model (PLM) on downstream tasks. However, a significant challenge is maintaining performance when a lightweight model must be trained with limited labelled samples. We present DisCo, a semi-supervised learning (SSL) framework for fine-tuning a cohort of small student models generated from a large PLM using knowledge distillation. Our key insight is to share complementary knowledge among distilled student cohorts to promote their SSL effectiveness. DisCo employs a novel co-training technique to optimize a cohort of multiple small student models by promoting knowledge sharing among students under diversified views: model views produced by different distillation strategies and data views produced by various input augmentations. We evaluate DisCo on both semi-supervised text classification and extractive summarization tasks. Experimental results show that DisCo can produce student models that are 7.6× smaller and 4.8× faster in inference than the baseline PLMs while maintaining comparable performance. We also show that DisCo-generated student models outperform similar-sized models carefully tuned for the respective tasks.
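To make the co-training idea in the abstract concrete, the sketch below shows one plausible reading of a single training step: two distilled students are fit on the small labelled batch and, on unlabelled data, each student matches the other's (detached) predictions under a different augmented data view. This is a minimal illustration, not the authors' released implementation; the `student_a`/`student_b` models, the `augment` helper, and the weighting term `alpha` are all hypothetical placeholders.

```python
# Hypothetical sketch of a DisCo-style co-training step (not the authors' code).
# Assumes two small transformer students distilled from the same PLM with
# different strategies ("model views") and an `augment` function that produces
# perturbed copies of unlabelled inputs ("data views").
import torch
import torch.nn.functional as F


def disco_step(student_a, student_b, labelled, unlabelled, augment, alpha=1.0):
    x_l, y_l = labelled

    # Supervised loss: both students learn from the limited labelled samples.
    sup = F.cross_entropy(student_a(x_l), y_l) + F.cross_entropy(student_b(x_l), y_l)

    # Data views: two different augmentations of the same unlabelled inputs.
    view_1, view_2 = augment(unlabelled), augment(unlabelled)
    logits_a = student_a(view_1)
    logits_b = student_b(view_2)

    # Knowledge sharing: each student treats the other's detached prediction
    # as a soft target, enforcing cross-view, cross-model consistency.
    consistency = (
        F.kl_div(F.log_softmax(logits_a, dim=-1),
                 F.softmax(logits_b, dim=-1).detach(), reduction="batchmean")
        + F.kl_div(F.log_softmax(logits_b, dim=-1),
                   F.softmax(logits_a, dim=-1).detach(), reduction="batchmean")
    )

    return sup + alpha * consistency
```

The detached soft targets are one common way to realise mutual-learning-style co-training; the paper itself should be consulted for the exact losses, view construction, and distillation strategies used.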
Metadata
| Item Type: | Proceedings Paper | 
|---|---|
| Authors/Creators: | Jiang, W., Miao, Q., Lin, C. and 4 more authors | 
| Copyright, Publisher and Additional Information: | ACL materials are Copyright © 1963–2023 ACL; other materials are copyrighted by their respective copyright holders. Materials prior to 2016 are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International License. Permission is granted to make copies for the purposes of teaching and research. Materials published in or after 2016 are licensed under a Creative Commons Attribution 4.0 International License. | 
| Dates: | Published: December 2023 | 
| Institution: | The University of Leeds | 
| Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) | 
| Depositing User: | Symplectic Publications | 
| Date Deposited: | 30 Oct 2023 17:13 | 
| Last Modified: | 21 Dec 2023 10:13 | 
| Published Version: | https://aclanthology.org/2023.emnlp-main.244/ | 
| Status: | Published | 
| Publisher: | ACL | 
| Related URLs: | |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:204676 | 
Download
Filename: DisCo Distilled Student Models Co-training for.pdf
Licence: CC-BY 4.0
