CVEjoin: An Information Security Vulnerability and Threat Intelligence Dataset

Francisco R. P. da Ponte,
Emanuel B. Rodrigues &
César L. C. Mattos

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 661)

Included in the following conference series:

International Conference on Advanced Information Networking and Applications

Abstract

The risk that an information security vulnerability will be exploited should not be determined by a single metric, such as the Common Vulnerability Scoring System (CVSS). Relying on one metric disregards both the global threat landscape and the vulnerable asset. Therefore, in addition to using traditional Vulnerability Management (VM) tools, analysts and researchers must manually curate datasets containing threat intelligence and context-specific information about security flaws. This activity is non-trivial and error-prone. To aid this endeavor, we developed a fully automated tool that gathers data about the intrinsic characteristics of vulnerabilities available in the National Vulnerability Database (NVD) and augments it with information collected from multiple security feeds and social networks. Altogether, we collected data on more than 200,000 vulnerabilities that can be used for various research topics, e.g., analyzing the risk of exploiting security flaws or predicting vulnerability severity. In this paper, we present a detailed description of the methodology used to create our dataset and of its attributes. Additionally, we perform an exploratory analysis of the data gathered and present an illustrative example of how analysts could use the data collected. The CVEjoin dataset and the scripts used for its construction are publicly available on GitHub.
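As a rough illustration of the kind of join the abstract describes, the sketch below combines NVD-style records with two threat-related feeds on the CVE identifier using pandas. It is not the authors' implementation: the function name `join_cve_with_intel`, the column names (`cve_id`, `cvss_base_score`, `exploit_ref`, `epss_score`), and the toy rows are all assumptions made for this example and may not match the actual CVEjoin schema.

```python
import pandas as pd


def join_cve_with_intel(nvd, exploits, epss):
    """Left-join NVD-style records with exploit and EPSS-style data on the CVE ID.

    All column names here are hypothetical, chosen only for illustration.
    """
    # Collapse the exploit feed to one row per CVE: number of known public exploits.
    exploit_counts = (
        exploits.groupby("cve_id").size().rename("exploit_count").reset_index()
    )
    # Left joins keep every vulnerability, even those with no threat data.
    joined = nvd.merge(exploit_counts, on="cve_id", how="left")
    joined = joined.merge(epss, on="cve_id", how="left")
    # A CVE absent from the exploit feed has zero known exploits.
    joined["exploit_count"] = joined["exploit_count"].fillna(0).astype(int)
    return joined


# Toy rows with placeholder identifiers and made-up values (not real data).
nvd = pd.DataFrame(
    {"cve_id": ["CVE-0000-0001", "CVE-0000-0002"], "cvss_base_score": [9.8, 5.3]}
)
exploits = pd.DataFrame(
    {"cve_id": ["CVE-0000-0001", "CVE-0000-0001"], "exploit_ref": ["ref-a", "ref-b"]}
)
epss = pd.DataFrame({"cve_id": ["CVE-0000-0001"], "epss_score": [0.9]})

joined = join_cve_with_intel(nvd, exploits, epss)
print(joined)
```

A left join is used so that vulnerabilities without any threat intelligence remain in the result; their missing exploitation score stays empty rather than being guessed.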

Notes

1. National vulnerability database website: https://nvd.nist.gov/.
2. The Mitre corporation website: https://cwe.mitre.org.
3. The Open Web Application Security Project website: https://owasp.org.
4. An archive of vulnerable software and exploits: https://www.exploit-db.com/.
5. Metric for estimating the probability of a vulnerability being exploited: https://www.first.org/epss/.
6. Microsoft security advisory: https://msrc.microsoft.com/update-guide/en-us.
7. Adobe security advisory: https://helpx.adobe.com/security.html.
8. Intel security advisory: https://www.intel.com/content/www/us/en/security-center/default.html.
9. Python package for working with URLs: https://docs.python.org/3/library/urllib.html.
10. Python library for scraping information from web pages: https://pypi.org/project/beautifulsoup4/.
11. Code developed to create the dataset: https://github.com/rodrigoparente/cvejoin-security-dataset.
12. Python package for data analysis and manipulation: https://pypi.org/project/pandas/.
13. News about Log4J vulnerability and how it was exploited: https://blog.qualys.com/vulnerabilities-threat-research/2021/12/10/apache-log4j2-zero-day-exploited-in-the-wild-log4shell.
14. News about CVE-2021-44142: https://www.helpnetsecurity.com/2022/02/02/samba-bug-may-allow-code-execution-as-root-on-linux-machines-nas-devices-cve-2021-44142/.
15. Vulnerability affecting SUSE OS: https://nvd.nist.gov/vuln/detail/CVE-2020-8025.
16. Vulnerability affecting a help desk tool: https://nvd.nist.gov/vuln/detail/CVE-2020-15849.

Acknowledgment

The authors would like to thank CAPES for the financial support.

Author information

Authors and Affiliations

Federal University of Ceará (UFC), Fortaleza, Brazil
Francisco R. P. da Ponte, Emanuel B. Rodrigues & César L. C. Mattos

Corresponding author

Correspondence to Francisco R. P. da Ponte.

Editor information

Editors and Affiliations

Department of Information and Communication Engineering, Faculty of Information Engineering, Fukuoka Institute of Technology, Fukuoka, Japan
Leonard Barolli

About this paper

Cite this paper

da Ponte, F.R.P., Rodrigues, E.B., Mattos, C.L.C. (2023). CVEjoin: An Information Security Vulnerability and Threat Intelligence Dataset. In: Barolli, L. (eds) Advanced Information Networking and Applications. AINA 2023. Lecture Notes in Networks and Systems, vol 661. Springer, Cham. https://doi.org/10.1007/978-3-031-29056-5_34

DOI: https://doi.org/10.1007/978-3-031-29056-5_34
Published: 20 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-29055-8
Online ISBN: 978-3-031-29056-5
eBook Packages: Intelligent Technologies and Robotics (R0)
