Abstract
Data from Online Social Networks, search engines, and the World Wide Web are forms of unstructured knowledge that are not regularly used in cybersecurity systems. The main reason for the reluctance to use them is the difficulty of processing them effectively and extracting valuable information. In this paper, we present the Systemic Analyzer In Network Threats (SAINT) Observatory Subsystem, or SAINToS for short, a novel platform for the acquisition and analysis of Open-Source Intelligence feeds. The proposed framework integrates different information pools to create a supplementary view of evolving cybercriminal activity. The aim of SAINToS is to provide additional models, methodologies, and mechanisms to enrich existing cybersecurity analysis. As a significant amount of related information is not standardized in the form of structured data tables or machine-processable formats (e.g., XML or JSON), secondary data sources, such as social networks and blogs, are expected to expand the scope and effectiveness of existing approaches. The emphasis of this work is placed on the harmonization and visualization of data from different sources, so that these sources can be better understood and reused. In addition, SAINToS, besides its standalone functionality and capabilities, can provide input, in standard formats, to other major threat intelligence platforms.
Data Availability Statement
The datasets generated and analyzed during the current study are not publicly available, because the continued operation of the server that demonstrates the use of the system is uncertain (i.e., no permanent link can be provided). They are available from the corresponding author on reasonable request.
Notes
The source code and technical details about the implementation of SAINToS can be found at https://github.com/tzamalisp/saint-open-source-tool-for-cyberthreats-monitoring.
SAINToS is part of the SAINT EU research project. More information is available at https://project-saint.eu.
The Global Security Map can be found at https://globalsecuritymap.com.
About this article
Cite this article
Vlachos, V., Stamatiou, Y.C., Tzamalis, P. et al. The SAINT observatory subsystem: an open-source intelligence tool for uncovering cybersecurity threats. Int. J. Inf. Secur. 21, 1091–1106 (2022). https://doi.org/10.1007/s10207-022-00599-2
DOI: https://doi.org/10.1007/s10207-022-00599-2