Abstract
The growing volume of data, particularly imbalanced data, poses significant challenges for classification. The problem is acute when identifying cyberattacks on industrial control system (ICS) networks, which have become a major concern because of the destructive impact that viruses and worms such as Slammer, Stuxnet, Duqu, Seismic Net, and Flame have had on critical infrastructure in various countries. The key challenge is to construct an intrusion detection system (IDS) framework that can handle imbalanced datasets. Much existing research focuses on binary classification, whereas multi-class classification is more challenging and remains an active research area. To address the multi-class imbalanced classification problem, we present an instance-based technique, named ICS-IDS, for intrusion detection in ICS environments, specifically SCADA networks. The technique consists of two core components: a data preparation component and a detection component. The data preparation component uses normalization, Fisher Discriminant Analysis, and a k-neighbors method to scale the data, reduce its dimensionality, and resample the dataset, respectively. The detection component uses an efficient instance-based learner to learn latent representations and to distinguish harmful vectors in attacked data. On an industrial network dataset, the proposed ICS-IDS model outperforms existing competing methods in detecting sophisticated attack vectors in ICS data, achieving 99% accuracy and a 99% detection rate (DR). These results indicate that the methodology is practical for implementing security in real-world ICS networks.
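To make the two-component design more concrete, the following is a minimal sketch of the scaling, resampling, and instance-based detection steps, assuming plain Python and a toy dataset. It is not the authors' implementation: all function names, the toy records, and the class labels are hypothetical, the Fisher Discriminant Analysis step is omitted for brevity, the resampling step is a simple SMOTE-like neighbor interpolation chosen for illustration, and the detection rate is assumed here to be the per-class recall.

```python
import math
import random
from collections import Counter


def fit_min_max(rows):
    """Learn per-feature min/max on training rows; return a scaler to [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    # A constant column has zero span; use 1.0 to avoid division by zero.
    spans = [(max(c) - lo) or 1.0 for c, lo in zip(cols, lows)]
    return lambda r: [(v - lo) / s for v, lo, s in zip(r, lows, spans)]


def nearest(point, pool, k):
    """Return the k points in pool closest to point (Euclidean distance)."""
    return sorted(pool, key=lambda q: math.dist(point, q))[:k]


def oversample(X, y, k=2, seed=0):
    """Grow each minority class to the majority size by interpolating between
    a sample and one of its k nearest same-class neighbours (assumed scheme)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            base = rng.choice(members)
            # A single-member class has no neighbours, so it is duplicated.
            others = [m for m in members if m is not base] or [base]
            mate = rng.choice(nearest(base, others, k))
            t = rng.random()
            X_out.append([b + t * (m - b) for b, m in zip(base, mate)])
            y_out.append(label)
    return X_out, y_out


def knn_predict(X_train, y_train, x, k=3):
    """Instance-based prediction: majority label among the k nearest points."""
    order = sorted(range(len(X_train)), key=lambda i: math.dist(x, X_train[i]))
    return Counter(y_train[i] for i in order[:k]).most_common(1)[0][0]


def detection_rate(y_true, y_pred, label):
    """Per-class recall: correctly detected instances / actual instances."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    total = sum(1 for t in y_true if t == label)
    return hits / total if total else 0.0


# Hypothetical toy records: two features, three imbalanced classes.
X_raw = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [10, 10], [11, 10], [20, 0]]
y_raw = ["normal"] * 5 + ["dos"] * 2 + ["recon"]

scale = fit_min_max(X_raw)
X_bal, y_bal = oversample([scale(r) for r in X_raw], y_raw)
pred = knn_predict(X_bal, y_bal, scale([10.5, 10]))
print(Counter(y_bal), pred)
```

In this sketch the scaler is fitted on the training records only and then reused for new records, so that test traffic is mapped with the same per-feature ranges as the data the learner has stored.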
Data availability
The corresponding author can provide the dataset used for the experiments in this study upon reasonable request.
Funding
This research received no external funding.
Contributions
BSA contributed to conceptualization, methodology, software, and writing (original draft); IU contributed to conceptualization, methodology, software, supervision, and project administration; TAS contributed to conceptualization and methodology; IAK contributed to software; IK contributed to conceptualization and methodology; YYG contributed to validation, resources, and writing (review and editing); KO contributed to funding acquisition; HH contributed to conceptualization and methodology; AA contributed to validation, resources, and writing (review and editing); RN contributed to supervision, project administration, and funding acquisition.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ali, B.S., Ullah, I., Al Shloul, T. et al. ICS-IDS: application of big data analysis in AI-based intrusion detection systems to identify cyberattacks in ICS networks. J Supercomput 80, 7876–7905 (2024). https://doi.org/10.1007/s11227-023-05764-5
DOI: https://doi.org/10.1007/s11227-023-05764-5