DOI: 10.1145/3524842.3528009 (short paper)

Methods2Test: a dataset of focal methods mapped to test cases

Published: 17 October 2022

Abstract

Unit testing is an essential part of the software development process, helping to identify issues with source code early in development and to prevent regressions. Machine learning has emerged as a viable approach to help software developers generate automated unit tests. However, generating reliable unit test cases that are semantically correct and capable of catching software bugs or unintended behavior via machine learning requires large, metadata-rich datasets. In this paper we present Methods2Test: a large, supervised dataset of test cases mapped to corresponding methods under test (i.e., focal methods). The dataset contains 780,944 pairs of JUnit tests and focal methods, extracted from 91,385 Java open-source projects hosted on GitHub with licenses permitting redistribution. The main challenge in creating Methods2Test was to establish a reliable mapping between a test case and the relevant focal method. To this aim, we designed a set of heuristics, based on developers' best practices in software testing, that identify the likely focal method for a given test case. To facilitate further analysis, we store a rich set of metadata for each method-test pair in JSON-formatted files. Additionally, we extract a textual corpus from the dataset at different context levels, provided in both raw and tokenized forms, to enable researchers to train and evaluate machine learning models for automated test generation. Methods2Test is publicly available at: https://github.com/microsoft/methods2test
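The mapping heuristics themselves are detailed in the full paper, not in this abstract. As a rough illustration of the naming-convention idea behind such heuristics (matching a JUnit test method's name against the methods declared in the class under test), the sketch below is a hypothetical simplification, not the authors' implementation; the function name and matching rules are assumptions:

```python
import re

def candidate_focal_methods(test_name, class_methods):
    """Hypothetical sketch of a name-matching heuristic (not the
    authors' actual implementation): match a JUnit test method name
    such as 'testParseDate' against method names declared in the
    focal class."""
    # Strip the conventional 'test' / 'test_' prefix and normalize case.
    stripped = re.sub(r'^test_?', '', test_name, flags=re.IGNORECASE)
    normalized = stripped.replace('_', '').lower()
    # A method is a candidate if its normalized name equals the
    # normalized test name, or is a prefix of it (covering names
    # like 'testParseDateInvalid' -> 'parseDate').
    return [m for m in class_methods
            if m.lower() == normalized or normalized.startswith(m.lower())]
```

For example, `candidate_focal_methods("testParseDate", ["parseDate", "format"])` would return `["parseDate"]`. The real pipeline also uses structural information (test class to focal class mapping, method calls inside the test body) to disambiguate when several methods match.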




Published In

MSR '22: Proceedings of the 19th International Conference on Mining Software Repositories
May 2022
815 pages
ISBN:9781450393034
DOI:10.1145/3524842
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. datasets
  2. software testing

Qualifiers

  • Short-paper

Conference

MSR '22

Cited By

  • (2024) Combining Coverage and Expert Features with Semantic Representation for Coincidental Correctness Detection. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 1770-1782. DOI: 10.1145/3691620.3695542
  • (2024) AgoneTest: Automated creation and assessment of Unit tests leveraging Large Language Models. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 2440-2441. DOI: 10.1145/3691620.3695318
  • (2024) An Empirical Study on Focal Methods in Deep-Learning-Based Approaches for Assertion Generation. Proceedings of the ACM on Software Engineering 1, FSE, 1750-1771. DOI: 10.1145/3660785
  • (2024) Domain Adaptation for Code Model-Based Unit Test Case Generation. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 1211-1222. DOI: 10.1145/3650212.3680354
  • (2024) Towards AI-Assisted Synthesis of Verified Dafny Methods. Proceedings of the ACM on Software Engineering 1, FSE, 812-835. DOI: 10.1145/3643763
  • (2024) From Fine-tuning to Output: An Empirical Investigation of Test Smells in Transformer-Based Test Code Generation. Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing, 1282-1291. DOI: 10.1145/3605098.3636058
  • (2024) Method-Level Test-to-Code Traceability Link Construction by Semantic Correlation Learning. IEEE Transactions on Software Engineering 50, 10, 2656-2676. DOI: 10.1109/TSE.2024.3449917
  • (2024) Assessing Evaluation Metrics for Neural Test Oracle Generation. IEEE Transactions on Software Engineering 50, 9, 2337-2349. DOI: 10.1109/TSE.2024.3433463
  • (2024) Measuring Software Testability via Automatically Generated Test Cases. IEEE Access 12, 63904-63916. DOI: 10.1109/ACCESS.2024.3396625
  • (2024) A3Test: Assertion-Augmented Automated Test case generation. Information and Software Technology, 107565. DOI: 10.1016/j.infsof.2024.107565
