Gruber, M., Roslan, M.F., Parry, O. et al. (3 more authors) (2024) Do automatic test generation tools generate flaky tests? In: ICSE '24: Proceedings of the 46th IEEE/ACM International Conference on Software Engineering. 46th International Conference on Software Engineering (ICSE 2024), 14-20 Apr 2024, Lisbon, Portugal. Association for Computing Machinery (ACM) , New York, NY, United States ISBN 9798400702174
Abstract
Non-deterministic test behavior, or flakiness, is common and dreaded among developers. Researchers have studied the issue and proposed approaches to mitigate it. However, the vast majority of previous work has only considered developer-written tests. The prevalence and nature of flaky tests produced by test generation tools remain largely unknown. We ask whether such tools also produce flaky tests and how these differ from developer-written ones. Furthermore, we evaluate mechanisms that suppress flaky test generation. We sample 6 356 projects written in Java or Python. For each project, we generate tests using EvoSuite (Java) and Pynguin (Python), and execute each test 200 times, looking for inconsistent outcomes. Our results show that flakiness is at least as common in generated tests as in developer-written tests. Nevertheless, existing flakiness suppression mechanisms implemented in EvoSuite are effective in alleviating this issue (71.7 % fewer flaky tests). Compared to developer-written flaky tests, the causes of generated flaky tests are distributed differently. Their non-deterministic behavior is more frequently caused by randomness, rather than by networking and concurrency. Using flakiness suppression, the remaining flaky tests differ significantly from any flakiness previously reported, where most are attributable to runtime optimizations and EvoSuiteinternal resource thresholds. These insights, with the accompanying dataset, can help maintainers to improve test generation tools, give recommendations for developers using these tools, and serve as a foundation for future research in test flakiness or test generation.
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: |
|
Copyright, Publisher and Additional Information: | © 2024 Owner/Author(s). This work is licensed under a Creative Commons Attribution International 4.0 License. https://creativecommons.org/licenses/by/4.0/ |
Keywords: | Test Generation; Flaky Tests; Empirical Study |
Dates: |
|
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Funding Information: | Funder Grant number ENGINEERING AND PHYSICAL SCIENCE RESEARCH COUNCIL EP/X024539/1 META PLATFORM INC 2537673289630253 |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 06 Oct 2023 08:28 |
Last Modified: | 12 Feb 2024 10:00 |
Status: | Published |
Publisher: | Association for Computing Machinery (ACM) |
Refereed: | Yes |
Identification Number: | 10.1145/3597503.3608138 |
Related URLs: | |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:203874 |