CVEjoin: An Information Security Vulnerability and Threat Intelligence Dataset

  • Conference paper
Advanced Information Networking and Applications (AINA 2023)

Abstract

The risk of exploiting information security vulnerabilities should not be determined solely by a single metric, such as the Common Vulnerability Scoring System (CVSS). Relying on one metric disregards the global threat landscape and the vulnerable asset. Therefore, in addition to using traditional Vulnerability Management (VM) tools, analysts and researchers must manually curate datasets containing threat intelligence and context-specific information about security flaws. However, this activity is non-trivial and error-prone. To aid this endeavor, we developed a fully automated tool that gathers data about the intrinsic characteristics of vulnerabilities available in the National Vulnerability Database (NVD) and augments it with information collected from multiple security feeds and social networks. Altogether, we collected data on more than 200,000 vulnerabilities that can be used for various research topics, e.g., analyzing the risk of exploiting security flaws or predicting vulnerability severity. In this paper, we present a detailed description of the methodology used to create our dataset and of its attributes. Additionally, we perform an exploratory analysis of the data gathered and present an illustrative example of how analysts could use the data collected. The CVEjoin dataset and the scripts used for its construction are publicly available on GitHub.
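The core idea of augmenting NVD records with threat intelligence can be pictured as a left join keyed on the CVE identifier. The sketch below is only an illustration of that idea and is not the authors' tool: the function `join_by_cve`, the field names (`cvss_base_score`, `epss`, `has_public_exploit`), and the placeholder records are all assumptions made for this example, and plain Python structures stand in for whatever the real scripts use.

```python
# Illustrative sketch only: placeholder data and hypothetical field names,
# not the actual CVEjoin schema or collection code.

# Stand-in for records gathered from a vulnerability database.
nvd_records = [
    {"cve_id": "CVE-0000-0001", "cvss_base_score": 9.8},
    {"cve_id": "CVE-0000-0002", "cvss_base_score": 5.3},
]

# Stand-ins for two external feeds, both keyed by CVE identifier.
epss_feed = {"CVE-0000-0001": 0.91}   # exploitation probability per CVE
exploit_feed = {"CVE-0000-0001"}      # CVEs with a known public exploit


def join_by_cve(records, epss, exploits):
    """Left-join vulnerability records with feed data on the CVE identifier."""
    joined = []
    for rec in records:
        row = dict(rec)  # copy so the input records stay unchanged
        row["epss"] = epss.get(rec["cve_id"])  # None when the feed has no entry
        row["has_public_exploit"] = rec["cve_id"] in exploits
        joined.append(row)
    return joined


rows = join_by_cve(nvd_records, epss_feed, exploit_feed)
for row in rows:
    print(row)
```

Keeping every base record and filling missing feed values with `None` means a vulnerability with no threat intelligence remains visible in the joined result instead of being dropped.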

Notes

  1. National Vulnerability Database website: https://nvd.nist.gov/.

  2. The MITRE Corporation website: https://cwe.mitre.org.

  3. The Open Web Application Security Project website: https://owasp.org.

  4. An archive of vulnerable software and exploits: https://www.exploit-db.com/.

  5. Metric for estimating the probability of a vulnerability being exploited: https://www.first.org/epss/.

  6. Microsoft security advisory: https://msrc.microsoft.com/update-guide/en-us.

  7. Adobe security advisory: https://helpx.adobe.com/security.html.

  8. Intel security advisory: https://www.intel.com/content/www/us/en/security-center/default.html.

  9. Python package for working with URLs: https://docs.python.org/3/library/urllib.html.

  10. Python library for scraping information from web pages: https://pypi.org/project/beautifulsoup4/.

  11. Code developed to create the dataset: https://github.com/rodrigoparente/cvejoin-security-dataset.

  12. Python package for data analysis and manipulation: https://pypi.org/project/pandas/.

  13. News about Log4J vulnerability and how it was exploited: https://blog.qualys.com/vulnerabilities-threat-research/2021/12/10/apache-log4j2-zero-day-exploited-in-the-wild-log4shell.

  14. News about CVE-2021-44142: https://www.helpnetsecurity.com/2022/02/02/samba-bug-may-allow-code-execution-as-root-on-linux-machines-nas-devices-cve-2021-44142/.

  15. Vulnerability affecting SUSE OS: https://nvd.nist.gov/vuln/detail/CVE-2020-8025.

  16. Vulnerability affecting a help desk tool: https://nvd.nist.gov/vuln/detail/CVE-2020-15849.

Acknowledgment

The authors would like to thank CAPES for the financial support.

Corresponding author

Correspondence to Francisco R. P. da Ponte.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

da Ponte, F.R.P., Rodrigues, E.B., Mattos, C.L.C. (2023). CVEjoin: An Information Security Vulnerability and Threat Intelligence Dataset. In: Barolli, L. (eds) Advanced Information Networking and Applications. AINA 2023. Lecture Notes in Networks and Systems, vol 661. Springer, Cham. https://doi.org/10.1007/978-3-031-29056-5_34
