To Share, or Not to Share: Exploring Test-Case Reusability in Fork Ecosystems

Authors:

Mukelabai Mukelabai,

Christoph Derks,

Thorsten Berger

ASE '23: Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering

Pages 837 - 849

https://doi.org/10.1109/ASE56229.2023.00191

Published: 26 September 2024

Abstract

Code is often reused to facilitate collaborative development, to create software variants, to experiment with new ideas, or to develop new features in isolation. Social-coding platforms, such as GitHub, enable enhanced code reuse with forking, pull requests, and cross-project traceability. With these concepts, forking has become a common strategy to reuse code by creating clones (i.e., forks) of projects. Forking thereby establishes fork ecosystems of co-existing projects that are similar but developed in parallel, often with rather sporadic code propagation and synchronization. Consequently, forked projects vary in quality and often involve redundant development efforts. Unfortunately, as we will show, many projects do not benefit from test cases created in other forks, even though those test cases could actually be reused to enhance the quality of other projects. We believe that reusing test cases (in addition to the implementation code) can improve software quality, software maintainability, and coding efficiency in fork ecosystems. While researchers have worked on test-case-reuse techniques, their potential to improve the quality of real fork ecosystems is unknown. To shed light on test-case reusability, we study to what extent test cases can be reused across forked projects. We mined a dataset of test cases from 305 fork ecosystems on GitHub (totaling 1,089 projects) and assessed the potential for reusing these test cases among the forked projects. By performing a manual inspection of the test cases' applicability, by transplanting the test cases, and by analyzing the causes of non-applicability, we contribute an understanding of the benefits (e.g., uncovering bugs) and of the challenges (e.g., automated code transplantation, deciding about applicability) of reusing test cases in fork ecosystems.

Index Terms

1. Human-centered computing
  1. Visualization
    1. Visualization techniques
2. Software and its engineering
  1. Software creation and management
    1. Software development techniques
    2. Software verification and validation
      1. Software defect analysis
        1. Software testing and debugging
  2. Software notations and tools
    1. Software libraries and repositories
    2. Software maintenance tools

Index terms have been assigned to the content through auto-classification.

Published In

ASE '23: Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering

November 2023

2161 pages

ISBN:9798350329964

General Chairs: Tegawendé F. Bissyandé (University of Luxembourg, Luxembourg), Jacques Klein (University of Luxembourg, Luxembourg)

Program Chairs: Christian Bird (Microsoft Research, United States), Federica Sarro (University College London, United Kingdom)

Sponsors

In-Cooperation

University of Luxembourg
IEEE CS

Publisher

IEEE Press

Qualifiers

Research-article

Conference

ASE '23

ASE '23: 38th IEEE/ACM International Conference on Automated Software Engineering

November 11 - 15, 2023

Echternach, Luxembourg

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%
