Abstract
The increasing interest in open source software has led to the emergence of large language-specific package distributions of reusable software libraries, such as npm and RubyGems. These software packages can be subject to vulnerabilities that may expose dependent packages through explicitly declared dependencies. Using Snyk’s vulnerability database, this article empirically studies vulnerabilities affecting npm and RubyGems packages. We analyse how and when these vulnerabilities are disclosed and fixed, and how their prevalence changes over time. We also analyse how vulnerable packages expose their direct and indirect dependents to vulnerabilities. We distinguish between two types of dependents: packages distributed via the package manager, and external GitHub projects depending on npm packages. We observe that the number of vulnerabilities in npm is increasing, and that these vulnerabilities are being disclosed faster than those in RubyGems. For both package distributions, the time required to disclose vulnerabilities is increasing over time. Vulnerabilities in npm packages affect a median of 30 package releases, compared to 59 releases for RubyGems packages. A large proportion of external GitHub projects is exposed to vulnerabilities coming from direct or indirect dependencies. 33% and 40% of dependency vulnerabilities to which projects and packages are exposed, respectively, have their fixes in more recent releases within the same major release range of the used dependency. Our findings reveal that more effort is needed to better secure open source package distributions.
Notes
If \(n\) different tests are carried out over the same dataset, for each individual test one can only reject \(H_0\) if \(p < \frac{0.05}{n}\). In our case \(n = 48\), i.e., \(p < 0.001\).
\(R^2 \in [0, 1]\), and the closer to 1, the better the model fits the data.
According to libraries.io, in May 2021, npm contained 1.79M packages compared to “only” 173K packages in RubyGems.
We implicitly assume here that the first unaffected release is the one containing the fix.
This analysis included Malicious Package vulnerabilities.
The two categories of directly and indirectly exposed package releases are non-exclusive.
The two categories of directly and indirectly exposed projects are non-exclusive.
Top-level packages are packages that do not have any dependent packages themselves.
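The two statistical notes above (the corrected significance threshold and the \(R^2\) goodness-of-fit measure) can be illustrated with a minimal Python sketch. The function names `bonferroni_threshold` and `r_squared` and the sample data are hypothetical, chosen only for illustration; this is not the authors' analysis code. Note that the formula \(1 - SS_{res}/SS_{tot}\) is only guaranteed to lie in \([0, 1]\) for a least-squares fit with an intercept.

```python
from typing import Sequence


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level when n_tests hypotheses share one dataset."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination, computed as 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(observed) / len(observed)
    ss_tot = sum((y - mean) ** 2 for y in observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Values from the note: alpha = 0.05 and n = 48 tests.
threshold = bonferroni_threshold(0.05, 48)
print(f"{threshold:.5f}")  # 0.00104, i.e. roughly p < 0.001

# Hypothetical observations and model predictions (illustrative data only).
print(r_squared([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8]))
```

Dividing the significance level by the number of tests is the Bonferroni correction; it is conservative, which is consistent with the strict threshold quoted in the note.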
Acknowledgments
This research was partially funded by the Excellence of Science project 30446992 SECO-Assist financed by F.R.S.-FNRS and FWO-Vlaanderen, as well as FNRS Research Credit J015120 and FNRS Research Project T001718. We express our gratitude to the security team of Snyk for granting us permission to use their dataset of vulnerability reports for research purposes.
Additional information
Communicated by: Jeffrey C. Carver
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Zerouali, A., Mens, T., Decan, A. et al. On the impact of security vulnerabilities in the npm and RubyGems dependency networks. Empir Software Eng 27, 107 (2022). https://doi.org/10.1007/s10664-022-10154-1
DOI: https://doi.org/10.1007/s10664-022-10154-1