Open access

Meta-Learning to Improve Unsupervised Intrusion Detection in Cyber-Physical Systems

Published: 22 September 2021


Artificial Intelligence (AI)-based classifiers rely on Machine Learning (ML) algorithms to provide functionalities that system architects are often willing to integrate into critical Cyber-Physical Systems (CPSs). However, such algorithms may misclassify observations, with potentially detrimental effects on the system itself or on the health of people and the environment. In addition, CPSs may be subject to previously unknown threats, motivating the need for Intrusion Detectors (IDs) that can effectively deal with zero-day attacks. Several studies have compared the misclassifications of various algorithms to identify the most suitable one for a given system. Unfortunately, even the most suitable algorithm may still produce an unsatisfactory number of misclassifications when system requirements are strict. A possible solution is to adopt meta-learners, which build ensembles of base-learners to reduce misclassifications and are widely used for supervised learning. Meta-learners have the potential to reduce misclassifications with respect to non-meta learners; however, misleading base-learners may steer the meta-learner towards misclassifications, and therefore their behavior needs to be carefully assessed through empirical evaluation. To this end, in this paper we investigate, expand, empirically evaluate, and discuss meta-learning approaches that rely on ensembles of unsupervised algorithms to detect (zero-day) intrusions in CPSs. Our experimental comparison is conducted on public datasets from network intrusion detection and biometric authentication systems, which are common IDSs for CPSs. Overall, we selected 21 datasets, 15 unsupervised algorithms, and 9 different meta-learning approaches. The results allow us to discuss the applicability and suitability of meta-learning for unsupervised anomaly detection by comparing the metric scores achieved by base algorithms and meta-learners. Our analyses and discussion show that the adoption of meta-learners significantly reduces misclassifications when detecting (zero-day) intrusions in CPSs.
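To make the idea of an ensemble of unsupervised base-learners concrete, the following is a minimal sketch of one simple combination scheme, majority voting. It is not the method evaluated in the article: the three base detectors, their thresholds, and all function names are hypothetical choices made for illustration, and the training data are assumed to be unlabeled and mostly normal.

```python
from statistics import mean, median, pstdev


def zscore_detector(train, threshold=3.0):
    """Hypothetical base-learner: flags values far from the training mean."""
    mu, sigma = mean(train), pstdev(train)

    def predict(x):
        return sigma > 0 and abs(x - mu) / sigma > threshold

    return predict


def mad_detector(train, threshold=3.5):
    """Hypothetical base-learner: robust score based on the median absolute deviation."""
    med = median(train)
    mad = median([abs(v - med) for v in train])

    def predict(x):
        # 1.4826 rescales the MAD to be comparable with a standard deviation
        return mad > 0 and abs(x - med) / (1.4826 * mad) > threshold

    return predict


def range_detector(train, margin=0.5):
    """Hypothetical base-learner: flags values well outside the observed range."""
    lo, hi = min(train), max(train)
    span = hi - lo

    def predict(x):
        return x < lo - margin * span or x > hi + margin * span

    return predict


def majority_vote(detectors, x):
    """Meta-level decision: anomalous if more than half of the base-learners agree."""
    votes = sum(1 for detect in detectors if detect(x))
    return votes * 2 > len(detectors)


# Unlabeled observations, assumed to be mostly normal behavior.
train = [10.0, 10.5, 9.8, 10.2, 9.9, 10.1, 10.3, 9.7]
detectors = [zscore_detector(train), mad_detector(train), range_detector(train)]


def ensemble(x):
    return majority_vote(detectors, x)


print(ensemble(10.0))  # a value inside the training range
print(ensemble(25.0))  # a value far from every training point
```

With an odd number of base-learners, a single misleading detector cannot decide the outcome alone, which illustrates why ensembles may reduce misclassifications. If most base-learners are misleading, the vote fails as well, which is the risk the abstract says must be assessed empirically.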







Information & Contributors


Published In

ACM Transactions on Cyber-Physical Systems  Volume 5, Issue 4
October 2021
312 pages
Editor: Chenyang Lu
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 September 2021
Accepted: 01 May 2021
Revised: 01 December 2020
Received: 01 August 2020
Published in TCPS Volume 5, Issue 4



Author Tags

  1. Critical systems
  2. intrusion detection
  3. machine learning
  4. meta-learning
  5. security
  6. reliability


  • Research-article
  • Refereed

Funding Sources

  • H2020 programme under the Marie Sklodowska-Curie


