DOI: 10.1145/2746194.2746198

Challenges with applying vulnerability prediction models

Published: 21 April 2015

Abstract

Vulnerability prediction models (VPMs) are believed to hold promise for giving software engineers guidance on where to prioritize precious verification resources when searching for vulnerabilities. However, while Microsoft product teams have adopted defect prediction models, they have not adopted VPMs. The goal of this research is to measure whether VPMs built using standard recommendations perform well enough to provide actionable results for engineering resource allocation. We define 'actionable' in terms of the inspection effort required to evaluate model results. We replicated a VPM for two releases of the Windows operating system, varying model granularity and statistical learners. We reproduced binary-level prediction precision (~0.75) and recall (~0.2). However, binaries often exceed one million lines of code, which is too large to inspect practically, and engineers expressed a preference for source-file-level predictions. Our source-file-level models yield precision below 0.5 and recall below 0.2. We suggest that VPMs must be refined to achieve actionable performance, possibly through security-specific metrics.




Published In

HotSoS '15: Proceedings of the 2015 Symposium and Bootcamp on the Science of Security
April 2015
170 pages
ISBN: 978-1-4503-3376-4
DOI: 10.1145/2746194
General Chair: David Nicol

Sponsors

  • US Army Research Office
  • National Science Foundation (NSF)
  • University of Illinois at Urbana-Champaign
  • National Security Agency

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. churn
  2. complexity
  3. coverage
  4. dependencies
  5. metrics
  6. prediction
  7. vulnerabilities

Qualifiers

  • Research-article

Funding Sources

  • National Security Agency

Conference

HotSoS '15
Sponsors:
  • US Army Research Office
  • NSF
  • National Security Agency

Acceptance Rates

HotSoS '15 paper acceptance rate: 13 of 22 submissions, 59%.
Overall acceptance rate: 34 of 60 submissions, 57%.

Article Metrics

  • Downloads (last 12 months): 83
  • Downloads (last 6 weeks): 10
Reflects downloads up to 20 Nov 2024

Cited By
  • Towards a Block-Level Conformer-Based Python Vulnerability Detection. Software 3(3):310-327, 2024. https://doi.org/10.3390/software3030016
  • A Comparative Study of Commit Representations for JIT Vulnerability Prediction. Computers 13(1):22, 2024. https://doi.org/10.3390/computers13010022
  • Towards a Block-Level ML-Based Python Vulnerability Detection Tool. Acta Cybernetica 26(3):323-371, 2024. https://doi.org/10.14232/actacyb.299667
  • Automatic Data Labeling for Software Vulnerability Prediction Models: How Far Are We? In Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 131-142, 2024. https://doi.org/10.1145/3674805.3686675
  • Early and Realistic Exploitability Prediction of Just-Disclosed Software Vulnerabilities: How Reliable Can It Be? ACM Transactions on Software Engineering and Methodology 33(6):1-41, 2024. https://doi.org/10.1145/3654443
  • A Codebert Based Empirical Framework for Evaluating Classification-Enabled Vulnerability Prediction Models. In Proceedings of the 17th Innovations in Software Engineering Conference, 1-11, 2024. https://doi.org/10.1145/3641399.3641405
  • A Fuzzing Method for Security Testing of Sensors. IEEE Sensors Journal 24(5):5522-5529, 2024. https://doi.org/10.1109/JSEN.2023.3301517
  • Application of Hierarchical Attention Network in Vulnerability Detection Model. In 2024 2nd International Conference on Big Data and Privacy Computing (BDPC), 43-48, 2024. https://doi.org/10.1109/BDPC59998.2024.10649344
  • Enhancing Software Co-Change Prediction: Leveraging Hybrid Approaches for Improved Accuracy. IEEE Access 12:68441-68452, 2024. https://doi.org/10.1109/ACCESS.2024.3399101
  • Vulnerability detection in Java source code using a quantum convolutional neural network with self-attentive pooling, deep sequence, and graph-based hybrid feature extraction. Scientific Reports 14(1), 2024. https://doi.org/10.1038/s41598-024-56871-z
  • …
