LLMs and prompting for unit test generation: a large-scale evaluation
buir.contributor.author | Koyuncu, Anıl | |
dc.citation.epage | 2465 | |
dc.citation.spage | 2464 | |
dc.contributor.author | Koyuncu, Anıl | |
dc.contributor.author | Ouedraogo, Wendkuuni C. | |
dc.contributor.author | Kabore, Kader | |
dc.contributor.author | Tian, Haoye | |
dc.contributor.author | Song, Yewei | |
dc.contributor.author | Klein, Jacques | |
dc.contributor.author | Lo, David | |
dc.contributor.author | Bissyande, Tegawende F. | |
dc.coverage.spatial | Sacramento, California, United States | |
dc.date.accessioned | 2025-02-21T10:59:20Z | |
dc.date.available | 2025-02-21T10:59:20Z | |
dc.date.issued | 2024-11-01 | |
dc.department | Department of Computer Engineering | |
dc.description | Conference Name: Proceedings - 39th ACM/IEEE International Conference on Automated Software Engineering, ASE 2024 | |
dc.description | Date of Conference: 28 October 2024 - 1 November 2024 | |
dc.description.abstract | Unit testing, essential for identifying bugs, is often neglected due to time constraints. Automated test generation tools exist but typically lack readability and require developer intervention. Large Language Models (LLMs) like GPT and Mistral show potential in test generation, but their effectiveness remains unclear. This study evaluates four LLMs and five prompt engineering techniques, analyzing 216,300 tests for 690 Java classes from diverse datasets. We assess correctness, readability, coverage, and bug detection, comparing LLM-generated tests to EvoSuite. While LLMs show promise, improvements in correctness are needed. The study highlights both the strengths and limitations of LLMs, offering insights for future research. © 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM. | |
dc.description.provenance | Submitted by Serdar Sevin (serdar.sevin@bilkent.edu.tr) on 2025-02-21T10:59:20Z No. of bitstreams: 1 LLMs_and_Prompting_for_Unit_Test_Generation_A_Large_Scale_Evaluation.pdf: 777952 bytes, checksum: 6f25dcf702b7b32fa3c3111dcdde2dd4 (MD5) | en |
dc.description.provenance | Made available in DSpace on 2025-02-21T10:59:20Z (GMT). No. of bitstreams: 1 LLMs_and_Prompting_for_Unit_Test_Generation_A_Large_Scale_Evaluation.pdf: 777952 bytes, checksum: 6f25dcf702b7b32fa3c3111dcdde2dd4 (MD5) Previous issue date: 2024-11-01 | en |
dc.identifier.doi | 10.1145/3691620.3695330 | |
dc.identifier.isbn | 979-840071248-7 | |
dc.identifier.uri | https://hdl.handle.net/11693/116554 | |
dc.language.iso | English | |
dc.publisher | Association for Computing Machinery, Inc | |
dc.relation.isversionof | https://dx.doi.org/10.1145/3691620.3695330 | |
dc.rights | CC BY 4.0 DEED (Attribution 4.0 International) | |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
dc.source.title | Proceedings - 39th ACM/IEEE International Conference on Automated Software Engineering, ASE 2024 | |
dc.subject | Automatic test generation | |
dc.subject | Empirical evaluation | |
dc.subject | Large language models | |
dc.subject | Prompt engineering | |
dc.subject | Unit tests | |
dc.title | LLMs and prompting for unit test generation: a large-scale evaluation | |
dc.type | Conference Paper |
Files
Original bundle
- Name: LLMs_and_Prompting_for_Unit_Test_Generation_A_Large_Scale_Evaluation.pdf
- Size: 759.72 KB
- Format: Adobe Portable Document Format
License bundle
- Name: license.txt
- Size: 1.71 KB
- Description: Item-specific license agreed upon to submission