Jiang, W., Miao, Q., Lin, C. et al. (4 more authors) (2023) DisCo: Distilled Student Models Co-training for Semi-supervised Text Mining. In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track. The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 06-10 Dec 2023, Singapore. ACL, Stroudsburg, PA, pp. 4015-4030. ISBN 979-8-89176-060-8
Abstract
Many text mining models are constructed by fine-tuning a large deep pre-trained language model (PLM) on downstream tasks. However, maintaining performance with a lightweight model and limited labelled samples remains a significant challenge. We present DisCo, a semi-supervised learning (SSL) framework for fine-tuning a cohort of small student models generated from a large PLM using knowledge distillation. Our key insight is to share complementary knowledge among distilled student cohorts to promote their SSL effectiveness. DisCo employs a novel co-training technique to optimize a cohort of multiple small student models by promoting knowledge sharing among students under diversified views: model views produced by different distillation strategies and data views produced by various input augmentations. We evaluate DisCo on both semi-supervised text classification and extractive summarization tasks. Experimental results show that DisCo can produce student models that are 7.6× smaller and 4.8× faster in inference than the baseline PLMs while maintaining comparable performance. We also show that DisCo-generated student models outperform similarly sized models elaborately tuned for distinct tasks.
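As a rough illustration of the co-training idea described above (not the paper's exact implementation), each student in the cohort can be regularized toward its peers' predictions on unlabeled data, alongside the usual supervised loss on labelled samples. The sketch below uses a symmetric KL consistency term between two students; the function name and weighting are hypothetical and assumed for this example only.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q, eps=1e-12):
    # KL(p || q) per example, with a small epsilon for stability.
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def cotraining_loss(logits_a, logits_b, labels=None):
    """Hypothetical two-student co-training objective:
    symmetric consistency between the students' predictive
    distributions, plus cross-entropy when labels are available."""
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    consistency = 0.5 * (kl(p_a, p_b) + kl(p_b, p_a)).mean()
    supervised = 0.0
    if labels is not None:
        idx = np.arange(len(labels))
        supervised = -np.log(p_a[idx, labels] + 1e-12).mean()
        supervised += -np.log(p_b[idx, labels] + 1e-12).mean()
    return supervised + consistency
```

When the two students agree exactly, the consistency term vanishes and only the supervised loss remains; disagreement on unlabeled views yields a positive penalty that pushes the students toward each other.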
Metadata
| Field | Value |
|---|---|
| Item Type | Proceedings Paper |
| Copyright, Publisher and Additional Information | ACL materials are Copyright © 1963–2023 ACL; other materials are copyrighted by their respective copyright holders. Materials prior to 2016 here are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International License. Permission is granted to make copies for the purposes of teaching and research. Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License. |
| Institution | The University of Leeds |
| Academic Units | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
| Depositing User | Symplectic Publications |
| Date Deposited | 30 Oct 2023 17:13 |
| Last Modified | 21 Dec 2023 10:13 |
| Published Version | https://aclanthology.org/2023.emnlp-main.244/ |
| Status | Published |
| Publisher | ACL |
| Open Archives Initiative ID (OAI ID) | oai:eprints.whiterose.ac.uk:204676 |
Download
Filename: DisCo Distilled Student Models Co-training for.pdf
Licence: CC-BY 4.0