DOI: 10.1145/3510003.3510216

The extent of orphan vulnerabilities from code reuse in open source software

Published: 05 July 2022

Abstract

Motivation: A key premise of open source software is the ability to copy code into other open source projects (white-box reuse). Such copying accelerates the development of new projects, but code flaws in the original projects, such as vulnerabilities, may also spread, and may persist in the copies even after they are fixed in the projects from which the code was taken. The extent to which vulnerabilities spread through code reuse, the potential impact of that spread, and avenues for mitigating the risk of these secondary vulnerabilities have not been studied in the context of a nearly complete collection of open source code.
Aim: We aim to find ways to detect vulnerabilities induced by white-box reuse, determine how prevalent they are, and explore how they may be addressed.
Method: We rely on the World of Code infrastructure, which provides a curated and cross-referenced collection of nearly all open source software, to conduct a case study of a few known vulnerabilities. For this case study we develop a tool, VDiOS, to help identify and fix white-box-reuse-induced vulnerabilities that have already been patched in the original projects (orphan vulnerabilities).
Results: We find numerous instances of orphan vulnerabilities, even in currently active and highly popular projects (over 1K stars). Even apparently inactive projects remain publicly available for others to use, which can spread the vulnerability further. The often long delay in fixing orphan vulnerabilities, even in highly popular projects, increases the chance that they spread to new projects. We provided patches to a number of project maintainers and found that only a small percentage accepted and applied them. We hope that VDiOS will lead to further study and mitigation of the risks from orphan vulnerabilities and other orphan code flaws.
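
To make the notion of an orphan vulnerability concrete, the following is a minimal sketch of one way such copies could be flagged. It assumes that a copied file is recognized by having exactly the same git blob hash as a known-vulnerable version of the original file. This is an illustrative assumption, not a description of how VDiOS works, and the function names, project names, and file contents are hypothetical.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the SHA-1 that git assigns to a file (blob) with this content."""
    # Git hashes the header "blob <size>\0" followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def find_orphan_candidates(vulnerable_blobs, projects):
    """Map each project to the paths whose content matches a vulnerable blob.

    vulnerable_blobs: set of blob hashes of known-vulnerable file versions.
    projects: mapping of project name -> {file path: file content as bytes}.
    """
    hits = {}
    for project, files in projects.items():
        matched = sorted(
            path
            for path, content in files.items()
            if git_blob_sha1(content) in vulnerable_blobs
        )
        if matched:
            hits[project] = matched
    return hits


# Hypothetical data: the upstream project has applied the fix, the copy has not.
vulnerable_version = b"int read_chunk(char *buf) { /* unchecked length */ }\n"
patched_version = b"int read_chunk(char *buf) { /* length checked */ }\n"

vulnerable_blobs = {git_blob_sha1(vulnerable_version)}
projects = {
    "example/upstream": {"src/chunk.c": patched_version},
    "example/copy": {"vendor/png/chunk.c": vulnerable_version},
}

candidates = find_orphan_candidates(vulnerable_blobs, projects)
print(candidates)
```

Exact hash matching only finds verbatim copies. A file that was modified after copying would need a similarity-based comparison, which this sketch does not attempt.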

Published In

ICSE '22: Proceedings of the 44th International Conference on Software Engineering
May 2022
2508 pages
ISBN: 9781450392211
DOI: 10.1145/3510003
In-Cooperation

• IEEE CS

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. CVE
2. code reuse
3. git
4. security vulnerabilities

Qualifiers

• Research-article

Conference

ICSE '22

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Article Metrics

• Downloads (last 12 months): 264
• Downloads (last 6 weeks): 49
Reflects downloads up to 23 Nov 2024

Cited By

• (2025) Understanding vulnerabilities in software supply chains. Empirical Software Engineering 30:1. DOI: 10.1007/s10664-024-10581-2. Online publication date: 1-Feb-2025
• (2024) PyRadar: Towards Automatically Retrieving and Validating Source Code Repository Information for PyPI Packages. Proceedings of the ACM on Software Engineering 1:FSE (2608-2631). DOI: 10.1145/3660822. Online publication date: 12-Jul-2024
• (2024) Automating Zero-Shot Patch Porting for Hard Forks. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (363-375). DOI: 10.1145/3650212.3652134. Online publication date: 11-Sep-2024
• (2024) Dataset: Copy-based Reuse in Open Source Software. Proceedings of the 21st International Conference on Mining Software Repositories (42-47). DOI: 10.1145/3643991.3644868. Online publication date: 15-Apr-2024
• (2024) Empirical Analysis of Vulnerabilities Life Cycle in Golang Ecosystem. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-13). DOI: 10.1145/3597503.3639230. Online publication date: 20-May-2024
• (2024) Strengthening Supply Chain Security with Fine-grained Safe Patch Identification. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-12). DOI: 10.1145/3597503.3639104. Online publication date: 20-May-2024
• (2023) Large Scale Study of Orphan Vulnerabilities in the Software Supply Chain. Proceedings of the 19th International Conference on Predictive Models and Data Analytics in Software Engineering (22-32). DOI: 10.1145/3617555.3617872. Online publication date: 8-Dec-2023
• (2023) Third-Party Library Dependency for Large-Scale SCA in the C/C++ Ecosystem: How Far Are We? Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis (1383-1395). DOI: 10.1145/3597926.3598143. Online publication date: 12-Jul-2023
• (2023) Applying the Universal Version History Concept to Help De-Risk Copy-Based Code Reuse. 2023 IEEE 23rd International Working Conference on Source Code Analysis and Manipulation (SCAM) (1-12). DOI: 10.1109/SCAM59687.2023.00012. Online publication date: 2-Oct-2023
• (2023) Cross Protocol Attack on IPSec-based VPN. 2023 11th International Symposium on Digital Forensics and Security (ISDFS) (1-6). DOI: 10.1109/ISDFS58141.2023.10131787. Online publication date: 11-May-2023
