Items where authors include "Scarton, C."
Article
Leite, J.A., Razuvayevskaya, O., Bontcheva, K. et al. (1 more author) (2025) Weakly supervised veracity classification with LLM-predicted credibility signals. EPJ Data Science, 14. 16. ISSN 2193-1127
Razuvayevskaya, O. orcid.org/0000-0002-7922-7982, Wu, B., Leite, J.A. et al. (5 more authors) (2024) Comparison between parameter-efficient techniques and full fine-tuning: a case study on multilingual news article classification. PLoS ONE, 19 (5). e0301738. ISSN 1932-6203
Singh, I. orcid.org/0000-0002-3788-3295, Scarton, C. orcid.org/0000-0002-0103-4072 and Bontcheva, K. (2023) UTDRM: unsupervised method for training debunked-narrative retrieval models. EPJ Data Science, 12 (1). 59. ISSN 2193-1127
Gow-Smith, E., Madabushi, H.T., Scarton, C. orcid.org/0000-0002-0103-4072 et al. (1 more author) (2022) Improving tokenisation by alternative treatment of spaces. arXiv. (Submitted)
Alva-Manchego, F., Scarton, C. and Specia, L. (2021) The (un)suitability of automatic evaluation metrics for text simplification. Computational Linguistics, 47 (4). pp. 861-889. ISSN 0891-2017
Singh, I., Scarton, C. orcid.org/0000-0002-0103-4072 and Bontcheva, K. orcid.org/0000-0001-6152-9600 (2021) Multistage BiCross encoder for multilingual access to COVID-19 health information. PLoS ONE, 16 (9). e0256874.
Singh, I., Bontcheva, K. orcid.org/0000-0001-6152-9600 and Scarton, C. orcid.org/0000-0002-0103-4072 (2021) The false COVID-19 narratives that keep being debunked : a spatiotemporal analysis. arXiv. (Submitted)
Jiang, Y., Song, X., Scarton, C. orcid.org/0000-0002-0103-4072 et al. (2 more authors) (2021) Categorising fine-to-coarse grained misinformation : an empirical study of COVID-19 infodemic. arXiv. (Submitted)
Alva-Manchego, F. orcid.org/0000-0001-6218-8377, Martin, L., Bordes, A. et al. (3 more authors) (2020) ASSET : a dataset for tuning and evaluation of sentence simplification models with multiple rewriting transformations. arXiv. (Submitted)
Alva-Manchego, F., Scarton, C. orcid.org/0000-0002-0103-4072 and Specia, L. (2020) Data-driven sentence simplification: Survey and benchmark. Computational Linguistics, 46 (1). pp. 135-187. ISSN 0891-2017
Scarton, C. orcid.org/0000-0002-0103-4072 (2019) Horacio Saggion, automatic text simplification. Synthesis lectures on human language technologies, April 2017. 137 pages, ISBN:1627058680 9781627058681. Natural Language Engineering, 26 (4). pp. 489-492. ISSN 1351-3249
Scarton, C., Forcada, M.L., Esplà-Gomis, M. et al. (1 more author) (2019) Estimating post-editing effort : a study on human judgements, task-based and reference-based metrics of MT quality. arXiv. (Submitted)
Toledo, C.M., Cunha, A., Scarton, C. et al. (1 more author) (2014) Automatic classification of written descriptions by healthy adults: An overview of the application of natural language processing and machine learning techniques to clinical discourse analysis. Dement Neuropsychol, 8 (3). pp. 227-235. ISSN 1980-5764
Proceedings Paper
Zareie, A., Bontcheva, K. orcid.org/0000-0001-6152-9600 and Scarton, C. orcid.org/0000-0002-0103-4072 (2025) A lightweight approach for user and keyword classification in controversial topics. In: Maria Aiello, L., Chakraborty, T. and Gaito, S., (eds.) Social Networks Analysis and Mining (ASONAM 2024). The 16th International Conference on Advances in Social Networks Analysis and Mining - ASONAM-2024, 02-05 Sep 2024, Rende, Italy. Lecture Notes in Computer Science, 15212 (1). Springer Nature Switzerland , pp. 243-253. ISBN 9783031785375
Leite, J.A. orcid.org/0000-0002-3587-853X, Razuvayevskaya, O. orcid.org/0000-0002-7922-7982, Bontcheva, K. orcid.org/0000-0001-6152-9600 et al. (1 more author) (2024) EUvsDisinfo: a dataset for multilingual detection of pro-Kremlin disinformation in news articles. In: CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management. 33rd ACM International Conference on Information and Knowledge Management, 21-25 Oct 2024, Boise, Idaho, USA. Association for Computing Machinery , pp. 5380-5384. ISBN 9798400704369
Vasilakes, J., Zhao, Z. orcid.org/0000-0002-3060-269X, Vykopal, I. et al. (3 more authors) (2024) ExU: AI models for examining multilingual disinformation narratives and understanding their spread. In: Scarton, C., Prescott, C., Bayliss, C., Oakley, C., Wright, J., Wrigley, S., Song, X., Gow-Smith, E., Forcada, M. and Moniz, H.L., (eds.) Proceedings of the 25th Annual Conference of the European Association for Machine Translation, EAMT 2024. 25th Annual Conference of the European Association for Machine Translation, 24-27 Jun 2024, Sheffield, United Kingdom. European Association for Machine Translation (EAMT) , pp. 39-40. ISBN 9781068690716
Mu, Y., Wu, B.P., Thorne, W. orcid.org/0000-0002-8947-6261 et al. (5 more authors) (2024) Navigating prompt complexity for zero-shot classification: a study of large language models in computational social science. In: Calzolari, N., Kan, M-Y., Hoste, V., Lenci, A., Sakti, S. and Xue, N., (eds.) Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 20-25 May 2024, Torino, Italy. ELRA and ICCL , pp. 12074-12086. ISBN 978-2-493814-10-4
Wu, B., Li, Y., Mu, Y. et al. (3 more authors) (2023) Don’t waste a single annotation: improving single-label classifiers through soft labels. In: Findings of the Association for Computational Linguistics: EMNLP 2023. 2023 Conference on Empirical Methods in Natural Language Processing, 06-10 Dec 2023, Singapore. Association for Computational Linguistics , pp. 5347-5355. ISBN 979-8-89176-061-5
Jiang, Y., Song, X. orcid.org/0000-0002-4188-6974, Scarton, C. orcid.org/0000-0002-0103-4072 et al. (3 more authors) (2023) Categorising fine-to-coarse grained misinformation: an empirical study of the COVID-19 infodemic. In: Mitkov, R. and Angelova, G., (eds.) Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing. 14th International Conference on Recent Advances in Natural Language Processing, 08-10 Sep 2023, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria , pp. 556-567. ISBN 978-954-452-092-2
Li, Y., Scarton, C. orcid.org/0000-0002-0103-4072, Song, X. orcid.org/0000-0002-4188-6974 et al. (1 more author) (2023) Classifying COVID-19 vaccine narratives. In: Mitkov, R. and Angelova, G., (eds.) Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing. 14th International Conference on Recent Advances in Natural Language Processing (RANLP 2023), 04-06 Sep 2023, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria , pp. 648-657. ISBN 978-954-452-092-2
Vincent, S., Flynn, R. and Scarton, C. orcid.org/0000-0002-0103-4072 (2023) MTCue: learning zero-shot control of extra-textual attributes by leveraging unstructured context in neural machine translation. In: Findings of the Association for Computational Linguistics: ACL 2023. Findings of the Association for Computational Linguistics: ACL 2023, 09-14 Jul 2023, Toronto, Canada. Association for Computational Linguistics , pp. 8210-8226. ISBN 9781959429623
Mu, Y., Jin, M., Grimshaw, C. et al. (3 more authors) (2023) VaxxHesitancy: A dataset for studying hesitancy towards COVID-19 vaccination on Twitter. In: Lin, Y.-R., Cha, M. and Quercia, D., (eds.) Proceedings of the International AAAI Conference on Web and Social Media. Seventeenth International AAAI Conference on Web and Social Media, 05-08 Jun 2023, Limassol, Cyprus. Association for the Advancement of Artificial Intelligence (AAAI) , pp. 1052-1062. ISBN 9781577358794
Singh, I., Bontcheva, K. orcid.org/0000-0001-6152-9600, Song, X. et al. (1 more author) (2022) Comparative analysis of engagement, themes, and causality of Ukraine-related debunks and disinformation. In: Hopfgartner, F., Jaidka, K., Mayr, P., Jose, J. and Breitsohl, J., (eds.) Social Informatics: 13th International Conference, SocInfo 2022, Glasgow, UK, October 19–21, 2022, Proceedings. 13th International Conference, SocInfo 2022, 19-21 Oct 2022, Glasgow, UK. Lecture Notes in Computer Science . Springer International Publishing , pp. 128-143. ISBN 9783031190964
Madabushi, H.T., Gow-Smith, E., Garcia, M. et al. (3 more authors) (2022) SemEval-2022 Task 2 : multilingual idiomaticity detection and sentence embedding. In: Emerson, G., Schluter, N., Stanovsky, G., Kumar, R., Palmer, A., Schneider, N., Singh, S. and Ratan, S., (eds.) Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022). 16th International Workshop on Semantic Evaluation (SemEval-2022), 14-15 Jul 2022, Seattle, WA, USA. Association for Computational Linguistics (ACL) , pp. 107-121. ISBN 9781955917803
Phelps, D., Fan, X.-R., Gow-Smith, E. et al. (3 more authors) (2022) Sample efficient approaches for idiomaticity detection. In: Bhatia, A., Cook, P., Taslimipoor, S., Garcia, M. and Ramisch, C., (eds.) Proceedings of The 18th Workshop on Multiword Expressions @LREC2022. The 18th Workshop on Multiword Expressions @LREC2022, 20-25 Jun 2022, Marseille, France. European Language Resources Association (ELRA) , pp. 105-111. ISBN 9791095546900
Tayyar Madabushi, H., Gow-Smith, E., Scarton, C. et al. (1 more author) (2021) AStitchInLanguageModels : dataset and methods for the exploration of idiomaticity in pre-trained language models. In: Moens, M.-F., Huang, X., Specia, L. and Yih, S.W.-T., (eds.) Findings of the Association for Computational Linguistics: EMNLP 2021. Findings of the Association for Computational Linguistics: EMNLP 2021, 07-11 Nov 2021, Punta Cana, Dominican Republic. Association for Computational Linguistics , pp. 3464-3477. ISBN 9781955917100
Garcia, M., Kramer Vieira, T., Scarton, C. et al. (2 more authors) (2021) Assessing idiomaticity representations in vector models with a noun compound dataset labeled at type and token levels. In: Proceedings of ACL-IJCNLP 2021. Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 01-06 Aug 2021, Bangkok, Thailand. Association for Computational Linguistics (ACL) , pp. 2730-2741. ISBN 978-1-954085-52-7
Garcia, M., Vieira, T.K., Scarton, C. et al. (2 more authors) (2021) Probing for idiomaticity in vector space models. In: Merlo, P., Tiedemann, J. and Tsarfaty, R., (eds.) Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics. 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021), 19-23 Apr 2021, Virtual conference. Association for Computational Linguistics (ACL) , pp. 3551-3564.
Scarton, C. orcid.org/0000-0002-0103-4072, Silva, D.F. and Bontcheva, K. (2020) Measuring what counts : the case of rumour stance classification. In: Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing. 10th International Joint Conference on Natural Language Processing - AACL-IJCNLP 2020, 04-07 Dec 2020, Suzhou, China (online). Association for Computational Linguistics (ACL) , pp. 925-932. ISBN 9781952148910
Leite, J.A., Silva, D.F., Bontcheva, K. et al. (1 more author) (2020) Toxic language detection in social media for Brazilian Portuguese : new dataset and multilingual analysis. In: Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing. 10th International Joint Conference on Natural Language Processing - AACL-IJCNLP 2020, 04-07 Dec 2020, Suzhou, China (online). Association for Computational Linguistics (ACL) , pp. 914-924. ISBN 9781952148910
Wick-Pedro, G. orcid.org/0000-0002-7332-4482, Santos, R.L.S., Vale, O.A. orcid.org/0000-0002-0091-8079 et al. (3 more authors) (2020) Linguistic analysis model for monitoring user reaction on satirical news for Brazilian Portuguese. In: Quaresma, P., Vieira, R., Aluísio, S.M., Moniz, H., Batista, F. and Gonçalves, T., (eds.) Computational Processing of the Portuguese Language. 14th International Conference, PROPOR 2020, 02-04 Mar 2020, Evora, Portugal. Lecture Notes in Computer Science, 12037 . Springer International Publishing , pp. 313-320. ISBN 9783030415044
Scarton, C. orcid.org/0000-0002-0103-4072, Madhyastha, P. and Specia, L. orcid.org/0000-0002-5495-3128 (2020) Deciding when, how and for whom to simplify. In: Giacomo, G.D., Catalá, A., Dilkina, B., Milano, M., Barro, S., Bugarín, A. and Lang, J., (eds.) ECAI 2020. ECAI 2020 : 24th European Conference on Artificial Intelligence , 29 Aug - 08 Sep 2020, Santiago de Compostela, Spain. IOS Press , pp. 2172-2179. ISBN 9781643681009
Alva-Manchego, F., Martin, L., Scarton, C. et al. (1 more author) (2019) EASSE: easier automatic sentence simplification evaluation. In: Padó, S. and Huang, R., (eds.) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations. The 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 03-07 Nov 2019, Hong Kong, China. Association for Computational Linguistics , pp. 49-54. ISBN 9781950737925
Forcada, M.L., Scarton, C., Specia, L. orcid.org/0000-0002-5495-3128 et al. (2 more authors) (2018) Exploring gap filling as a cheaper alternative to reading comprehension questionnaires when evaluating machine translation for gisting. In: Proceedings of the Third Conference on Machine Translation. Third Conference on Machine Translation (WMT18), 31 Oct - 01 Nov 2018, Brussels, Belgium. ACL , pp. 192-203.
Specia, L. orcid.org/0000-0002-5495-3128, Paetzold, G.H. and Scarton, C. orcid.org/0000-0002-0103-4072 (2015) Multi-level translation quality prediction with QuEst++. In: Proceedings of ACL-IJCNLP 2015 System Demonstrations. The 53rd Annual Meeting of the Association for Computational Linguistics and The 7th International Joint Conference on Natural Language Processing, 26-31 Jul 2015, Beijing, China. Association for Computational Linguistics (ACL) , pp. 115-120. ISBN 9781941643990
Book
Specia, L. orcid.org/0000-0002-5495-3128, Scarton, C. and Paetzold, G.H. (2018) Quality estimation for machine translation. Synthesis Lectures On Human Language Technologies, 39 . Morgan & Claypool Publishers . ISBN 9781681733753
Preprint
Vasilakes, J., Scarton, C. and Zhao, Z. orcid.org/0000-0002-3060-269X (2025) Exploring vision language models for multimodal and multilingual stance detection. [Preprint] (Submitted)
Leite, J.A., Razuvayevskaya, O., Scarton, C. et al. (1 more author) (2024) A cross-domain study of the use of persuasion techniques in online disinformation. [Preprint] (Submitted)
Srba, I., Razuvayevskaya, O., Leite, J.A. et al. (10 more authors) (2024) A survey on automatic credibility assessment of textual credibility signals in the era of large language models. [Preprint] (Submitted)
Leite, J.A., Razuvayevskaya, O., Bontcheva, K. orcid.org/0000-0001-6152-9600 et al. (1 more author) (2023) Detecting misinformation with LLM-predicted credibility signals and weak supervision. [Preprint] (Submitted)
Singh, I., Scarton, C. orcid.org/0000-0002-0103-4072, Song, X. orcid.org/0000-0002-4188-6974 et al. (1 more author) (2023) Finding already debunked narratives via multistage retrieval: enabling cross-lingual, cross-dataset and zero-shot learning. [Preprint] (Submitted)