DOI: 10.1145/3377811.3380442
research-article

Impact analysis of cross-project bugs on software ecosystems

Published: 01 October 2020

Abstract

Software projects increasingly form socio-technical ecosystems in which individual projects rely on infrastructure or functional components provided by other projects, leading to complex inter-dependencies. Through these inter-project dependencies, a bug in an upstream project can have a profound impact on a large number of downstream projects, resulting in cross-project bugs. This emerging type of bug poses new challenges for bug fixing because its influence on downstream projects is unclear. In this paper, we present an approach to estimating the impact of a cross-project bug within its ecosystem by identifying the affected downstream modules (classes/methods). Note that a downstream project that uses a buggy upstream function may not be affected if its usage does not satisfy the failure-inducing preconditions. For a reported bug with a known root-cause function and known failure-inducing preconditions, we first collect the candidate downstream modules that call the upstream function through an ecosystem-wide dependence analysis. Then, the paths to the call sites of the buggy upstream function are encoded as symbolic constraints. Solving these constraints together with the failure-inducing preconditions identifies the affected downstream modules. Our evaluation of 31 existing upstream bugs on the scientific Python ecosystem, containing 121 versions of 22 popular projects (16 million LOC in total), shows that the approach is highly effective: from the 25,490 candidate downstream modules that invoke the buggy upstream functions, it identifies 1,132 modules where the upstream bugs can be triggered, pruning 95.6% of the candidates. The technique has no false negatives and an average false positive rate of 7.9%. Only 49 of the 1,132 affected downstream modules had previously been reported as affected.
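
The core check described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a brute-force search over a small integer domain stands in for a real constraint solver, and every name in it (`find_trigger`, the two downstream modules, the `n == 0` precondition) is hypothetical and chosen only for illustration. A candidate module counts as affected when the path constraints guarding its call site can hold at the same time as the failure-inducing precondition.

```python
from itertools import product


def find_trigger(path_constraints, precondition, variables, domain):
    """Return an input assignment that reaches the call site AND satisfies
    the failure-inducing precondition, or None if no such assignment exists
    in the searched domain. Exhaustive enumeration is an assumption made
    for this sketch; it stands in for a constraint solver."""
    constraints = list(path_constraints) + [precondition]
    for values in product(domain, repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(check(env) for check in constraints):
            return env
    return None


# Hypothetical upstream bug: the function misbehaves when called with n == 0.
def precondition(env):
    return env["n"] == 0


# Hypothetical candidate downstream modules, each with the path constraints
# that guard its call to the buggy upstream function.
candidates = {
    "module_a.process": [lambda env: env["n"] > 0],     # guard excludes n == 0
    "module_b.summarize": [lambda env: env["n"] >= 0],  # guard admits n == 0
}

affected = {}
for name, path_constraints in candidates.items():
    witness = find_trigger(path_constraints, precondition, ["n"], range(-3, 4))
    if witness is not None:
        affected[name] = witness

print(affected)  # {'module_b.summarize': {'n': 0}}
```

Here `module_a.process` is pruned because its guard `n > 0` contradicts the precondition, while `module_b.summarize` is kept together with a witness input. Enumeration only works for tiny finite domains, which is why a symbolic encoding handed to a solver is the practical choice at ecosystem scale.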

Information & Contributors

Information

Published In

ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering
June 2020
1640 pages
ISBN: 9781450371216
DOI: 10.1145/3377811
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

In-Cooperation

  • KIISE: Korean Institute of Information Scientists and Engineers
  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. bug impact
  2. cross-project bugs
  3. dependence analysis
  4. software ecosystems
  5. symbolic constraints

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • National Key R&D Program of China
  • NSF

Conference

ICSE '20

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 188
  • Downloads (Last 6 weeks): 13
Reflects downloads up to 23 Nov 2024

Citations

Cited By

  • (2024) Effective Vulnerable Function Identification based on CVE Description Empowered by Large Language Models. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pp. 393-405. DOI: 10.1145/3691620.3695013. Online publication date: 27-Oct-2024
  • (2024) I3DE: An IDE for Inspecting Inconsistencies in PL/SQL Code. Proceedings of the 1st ACM/IEEE Workshop on Integrated Development Environments, pp. 74-75. DOI: 10.1145/3643796.3648461. Online publication date: 20-Apr-2024
  • (2024) Investigating user feedback from a crowd in requirements management in software ecosystems. Empirical Software Engineering 29:6. DOI: 10.1007/s10664-024-10546-5. Online publication date: 23-Sep-2024
  • (2023) Automatically Resolving Dependency-Conflict Building Failures via Behavior-Consistent Loosening of Library Version Constraints. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 198-210. DOI: 10.1145/3611643.3616264. Online publication date: 30-Nov-2023
  • (2023) Effective Recommendation of Cross-Project Correlated Issues based on Issue Metrics. Proceedings of the 14th Asia-Pacific Symposium on Internetware, pp. 1-1. DOI: 10.1145/3609437.3609462. Online publication date: 4-Aug-2023
  • (2023) A Grounded Theory of Cross-Community SECOs: Feedback Diversity Versus Synchronization. IEEE Transactions on Software Engineering 49:10, pp. 4731-4750. DOI: 10.1109/TSE.2023.3313875. Online publication date: 18-Sep-2023
  • (2023) DGMF: Fast Generation of Comparable, Updatable Dependency Graphs for Software Repositories. 2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR), pp. 115-119. DOI: 10.1109/MSR59073.2023.00028. Online publication date: May-2023
  • (2023) Understanding the Threats of Upstream Vulnerabilities to Downstream Projects in the Maven Ecosystem. Proceedings of the 45th International Conference on Software Engineering, pp. 1046-1058. DOI: 10.1109/ICSE48619.2023.00095. Online publication date: 14-May-2023
  • (2023) Just-in-time identification for cross-project correlated issues. Journal of Software: Evolution and Process. DOI: 10.1002/smr.2637. Online publication date: 26-Dec-2023
  • (2022) Toward Using Package Centrality Trend to Identify Packages in Decline. IEEE Transactions on Engineering Management 69:6, pp. 3618-3632. DOI: 10.1109/TEM.2021.3122012. Online publication date: Dec-2022
