DOI: 10.1145/3524842.3528009 (short paper)

Methods2Test: a dataset of focal methods mapped to test cases

Published: 17 October 2022

Abstract

Unit testing is an essential part of the software development process, helping to identify issues with source code early in development and to prevent regressions. Machine learning has emerged as a viable approach to help software developers generate automated unit tests. However, generating reliable unit test cases that are semantically correct and capable of catching software bugs or unintended behavior via machine learning requires large, metadata-rich datasets. In this paper we present Methods2Test: a large, supervised dataset of test cases mapped to corresponding methods under test (i.e., focal methods). The dataset contains 780,944 pairs of JUnit tests and focal methods, extracted from 91,385 Java open-source projects hosted on GitHub with licenses permitting redistribution. The main challenge in creating Methods2Test was to establish a reliable mapping between a test case and the relevant focal method. To this aim, we designed a set of heuristics, based on developers' best practices in software testing, that identify the likely focal method for a given test case. To facilitate further analysis, we store a rich set of metadata for each method-test pair in JSON-formatted files. Additionally, we extract a textual corpus from the dataset at different context levels, provided in both raw and tokenized forms, to enable researchers to train and evaluate machine learning models for automated test generation. Methods2Test is publicly available at: https://github.com/microsoft/methods2test
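The mapping heuristics themselves are detailed in the full paper, not in this abstract. As a rough illustration of the naming-convention idea behind such heuristics (matching a JUnit test method's name against the methods declared in the class under test), the sketch below is a hypothetical simplification, not the authors' implementation; the function name and matching rules are assumptions:

```python
import re

def candidate_focal_methods(test_name, class_methods):
    """Hypothetical sketch of a name-matching heuristic (not the
    authors' actual implementation): match a JUnit test method name
    such as 'testParseDate' against method names declared in the
    focal class."""
    # Strip the conventional 'test' / 'test_' prefix and normalize case.
    stripped = re.sub(r'^test_?', '', test_name, flags=re.IGNORECASE)
    normalized = stripped.replace('_', '').lower()
    # A method is a candidate if its normalized name equals the
    # normalized test name, or is a prefix of it (covering names
    # like 'testParseDateInvalid' -> 'parseDate').
    return [m for m in class_methods
            if m.lower() == normalized or normalized.startswith(m.lower())]
```

For example, `candidate_focal_methods("testParseDate", ["parseDate", "format"])` would return `["parseDate"]`. The real pipeline also uses structural information (test class to focal class mapping, method calls inside the test body) to disambiguate when several methods match.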




Published In

MSR '22: Proceedings of the 19th International Conference on Mining Software Repositories
May 2022
815 pages
ISBN:9781450393034
DOI:10.1145/3524842
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. datasets
  2. software testing

Qualifiers

  • Short-paper

Conference

MSR '22

Cited By

  • (2024) Combining Coverage and Expert Features with Semantic Representation for Coincidental Correctness Detection. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 1770-1782. DOI: 10.1145/3691620.3695542
  • (2024) AgoneTest: Automated creation and assessment of Unit tests leveraging Large Language Models. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 2440-2441. DOI: 10.1145/3691620.3695318
  • (2024) An Empirical Study on Focal Methods in Deep-Learning-Based Approaches for Assertion Generation. Proceedings of the ACM on Software Engineering 1, FSE, 1750-1771. DOI: 10.1145/3660785
  • (2024) Domain Adaptation for Code Model-Based Unit Test Case Generation. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 1211-1222. DOI: 10.1145/3650212.3680354
  • (2024) Towards AI-Assisted Synthesis of Verified Dafny Methods. Proceedings of the ACM on Software Engineering 1, FSE, 812-835. DOI: 10.1145/3643763
  • (2024) From Fine-tuning to Output: An Empirical Investigation of Test Smells in Transformer-Based Test Code Generation. Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing, 1282-1291. DOI: 10.1145/3605098.3636058
  • (2024) Method-Level Test-to-Code Traceability Link Construction by Semantic Correlation Learning. IEEE Transactions on Software Engineering 50, 10, 2656-2676. DOI: 10.1109/TSE.2024.3449917
  • (2024) Assessing Evaluation Metrics for Neural Test Oracle Generation. IEEE Transactions on Software Engineering 50, 9, 2337-2349. DOI: 10.1109/TSE.2024.3433463
  • (2024) Measuring Software Testability via Automatically Generated Test Cases. IEEE Access 12, 63904-63916. DOI: 10.1109/ACCESS.2024.3396625
  • (2024) A3Test: Assertion-Augmented Automated Test case generation. Information and Software Technology, 107565. DOI: 10.1016/j.infsof.2024.107565
