Evaluation of Machine Learning Techniques for Traffic Flow-Based Intrusion Detection
Abstract
1. Introduction
- An extensive benchmark study of different machine learning techniques applied to the CICIDS2017 dataset, to determine which ones perform best considering both classification scores and execution times.
- Evaluate the differences between multiclass classification (distinguishing between different types of attack) and binary classification (distinguishing only between normal traffic and attack flows).
- Evaluate the use of a correlation-based feature selection (CFS) technique for traffic flow attributes.
- Evaluate the use of Zeek to extract the traffic flows and attributes and compare the classification results obtained with respect to those obtained using the original CICIDS2017 dataset flows and attributes.
2. Materials and Methods
2.1. Intrusion Detection Datasets
- DARPA [18,19]. DARPA 1998/99 datasets, created at the MIT Lincoln Lab within an emulated network environment, are probably the most popular data sets for intrusion detection. DARPA 1998 and DARPA 1999 datasets contain seven and five weeks, respectively, of network traffic in packet-based format, including different types of attacks, e.g., DoS, port scans, buffer overflow, or rootkits. These datasets have been criticized due to artificial attack injections or redundancies [20,21].
- NSL-KDD [21]. It is an improved version of KDDCUP99 (which is based on the DARPA dataset) developed at the University of New Brunswick. To avoid the large amount of redundancy in KDDCUP99, the authors of NSL-KDD removed duplicates from the KDDCUP99 dataset and created more sophisticated predefined subsets divided into training and test subsets. NSL-KDD uses the same attributes as KDDCUP99 and belongs to the third format category described before. The underlying network traffic of NSL-KDD dates back to the year 1998, but it has remained a reference dataset to this day.
- ISCX 2012 [22]. This dataset was created in the Canadian Institute for Cybersecurity by capturing traffic in an emulated network environment over one week. A systematic dynamic approach was proposed to generate an intrusion detection dataset including normal and malicious network behavior through the definition of different profiles. These profiles determine attack scenarios, such as SSH brute force, DoS, or DDoS, and normal user behavior, e.g., writing e-mails or browsing the web. The profiles were used to create the dataset both in packet-based and bidirectional flow-based format, and the dynamic approach allows the generation of new datasets.
- UGR’16 [23]. This dataset was created at the University of Granada and is a unidirectional flow-based dataset, which relies on capturing periodic effects in an ISP environment over four months. IP addresses were anonymized, and the flows were labeled as normal, background, or attack. The attacks correspond either to explicitly performed attacks (botnet, DoS, and port scans) or to traffic manually identified and labeled as attacks. Most of the traffic is labeled as background, which could be either normal or attack traffic.
- UNSW-NB15 [24]. This dataset, created by the Australian Centre for Cyber Security, includes normal and malicious network traffic in packet-based format captured in an emulated environment. The dataset is also available in flow-based format with additional attributes. It contains nine different families of attacks, e.g., backdoors, DoS, exploits, fuzzers, or worms. UNSW-NB15 comes along with predefined splits for training and test.
- CICIDS2017 [25]. This dataset was created at the Canadian Institute for Cybersecurity within an emulated environment over a period of 5 days and contains network traffic in packet-based and bidirectional flow-based format (each flow characterized by more than 80 attributes). Normal user behavior is executed through scripts. The dataset contains a wide range of attack types, such as SSH brute force, Heartbleed, botnet, DoS, DDoS, web attacks, and infiltration attacks. CICIDS2017 has been selected in this work due to its novelty and the properties considered in its development. It is described in more detail below.
2.2. CICIDS2017 Dataset
- Brute force attack: it is essentially a trial-and-error attack, which can be used for password cracking and also to discover hidden pages and content in a web application.
- Heartbleed attack: it is performed by sending a malformed heartbeat request with a small payload and a large length field to the vulnerable party (e.g., a server) in order to trigger the victim’s response. It exploits a bug in the OpenSSL cryptography library’s implementation of the transport layer security (TLS) heartbeat extension.
- Botnet: Networks of hijacked computers (known as zombies) under cybercriminal control, used to carry out various scams, spam campaigns, and cyberattacks. Zombie computers communicate with a command-and-control center to receive attack instructions.
- DoS attack: The attacker tries to make a machine or network resource temporarily unavailable. This is usually achieved by flooding the target machine or resource with superfluous requests in an attempt to overload systems and prevent some or all legitimate requests from being fulfilled.
- DDoS attack: The result of multiple compromised systems flooding the targeted system with large volumes of network traffic, exhausting the bandwidth or resources of the victim.
- Web attack: Targets vulnerabilities in websites to gain unauthorized access, obtain confidential information, introduce malicious content, or alter the website’s content. In SQL injection, the attacker injects SQL commands to force the database to return information. In cross-site scripting (XSS), attackers inject scripts into pages whose code does not properly validate input, and in brute force over HTTP they try lists of passwords to find the administrator’s password.
- Infiltration attack: It is usually carried out by exploiting vulnerable software on the target’s computer. After successful exploitation, a backdoor is executed on the victim’s computer, which can then perform other attacks on the network (full port scan, service detection, etc.).
2.3. Machine Learning Techniques for Classification
- Naive Bayes: This classifier is a probabilistic method based on the computation of conditional probabilities and Bayes’ theorem. It is described as naive due to its simplifying independence hypothesis for the predictor variables, i.e., naive Bayes starts from the hypothesis that all the attributes are independent of each other given the class [28,29].
- Logistic: In the multinomial logistic regression model, the logistic regression method is generalized for multiclass problems and is therefore used to predict the probabilities of the different possible outcomes of a categorically distributed dependent variable (class) from a set of independent variables (attributes) [28].
- Multi-layer perceptron (MLP): This is an artificial neural network made up of multiple layers, which allows solving problems that are not linearly separable [28]. In the network used, three layers have been considered: an input layer, where the values of the attributes are introduced; a hidden layer, to which all the input nodes are connected; and an output layer, in which the classification values of the instances are obtained according to the classes.
- Sequential minimal optimization (SMO): This is a classifier in which John Platt’s sequential minimal optimization algorithm [30] is used to train a support vector machine (SVM) classifier.
- k-nearest neighbors (IBk): The k-NN method is based on classifying an instance according to the classes of the k-most similar training instances. It is a non-parametric classification method that allows estimating the probability density function or directly the a posteriori probability that an instance belongs to a class using the information provided by the set of classified instances [28,31].
- Adaptive boosting (AB): This is a meta-algorithm for statistical classification that can be used in combination with other learning algorithms to improve the performance [28,32]. In this way, the output of the other learning algorithms (weak learners) is combined into a weighted sum that represents the final output of the non-linear classifier.
- OneR: This is one of the simplest and fastest classification algorithms, although its results are sometimes surprisingly good compared to much more complex algorithms [33]. In OneR, a rule is generated for each attribute and the one that yields the smallest classification error is selected.
- J48 algorithm: This algorithm is an implementation of C4.5 [34], one of the most widely used algorithms in data mining applications, and is based on the generation of a decision tree from the data by recursive partitioning. The procedure to generate the decision tree consists in selecting one of the attributes as the root of the tree and creating a branch with each of the possible values of that attribute. In each resulting branch, a new node is created and the same process is repeated, i.e., another attribute is selected and a new branch is generated for each of its possible values. The process is repeated until all instances have been sorted through some path in the tree.
- PART: This algorithm is a simplified (partial) version of the C4.5 algorithm and is based on the construction of rules [35]. In this algorithm, the branches of the decision tree are replaced by rules, and although the underlying operation is the same, only the branches of the tree that are most effective in selecting a class are included, which simplifies the implementation of the classifier. In each iteration, a partial pruned C4.5 decision tree is built, and the best leaf obtained (the one that classifies the largest number of examples) is transformed into a decision rule; then the created tree is deleted. Each time a rule is generated, the instances covered by that rule are removed, and rules continue to be generated until there are no instances left to classify. This strategy provides great flexibility and speed: PART does not generate a complete tree but rather a partial one, in which the construction and pruning steps are combined until a stable subtree that cannot be simplified is found, at which point the rule is generated from that subtree.
- Random Forest (RF): In this technique, random forests are built by bagging ensembles of random trees [28,36]. Trees built using the algorithm consider a certain number of random features at each node, without performing any pruning. The algorithm combines hundreds of decision trees, each trained on a different random selection of instances. The final random forest predictions are made by averaging the predictions obtained from the individual trees. Using random forest can reduce the overfitting effect of individual decision trees at the expense of increased computational complexity. (A minimal training and evaluation sketch covering several of the classifiers in this list is given right after the list.)
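The classifier names used above (Logistic, MLP, SMO, IBk, AB, OneR, PART, J48, RF) match the Weka implementations referenced in [37,38]. Purely as an illustrative sketch, and not the exact experimental setup of this work, the snippet below shows how a comparable train-and-score loop could be assembled with scikit-learn equivalents; the CSV file name, the label column, the 80/20 split, and all hyperparameters are assumptions for the example, and OneR and PART have no direct scikit-learn counterpart.

```python
# Illustrative sketch only: binary (attack vs. normal) benchmark loop with scikit-learn
# stand-ins for the classifiers discussed above. File name and column names are assumed.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.svm import LinearSVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score

df = pd.read_csv("cicids2017_flows.csv")               # hypothetical CSV of labelled flows
df = df.replace([np.inf, -np.inf], np.nan).dropna()    # guard against missing/infinite values
X = df.drop(columns=["Label"])                         # numeric flow attributes
y = (df["Label"] != "BENIGN").astype(int)              # binary setting: attack = 1, normal = 0

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

classifiers = {
    "NaiveBayes": GaussianNB(),
    "Logistic": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "MLP": make_pipeline(StandardScaler(), MLPClassifier(hidden_layer_sizes=(50,))),
    "SVM (SMO-like)": make_pipeline(StandardScaler(), LinearSVC()),
    "kNN (IBk-like)": KNeighborsClassifier(n_neighbors=5),
    "AdaBoost": AdaBoostClassifier(),
    "Tree (J48-like)": DecisionTreeClassifier(),
    "RandomForest": RandomForestClassifier(n_estimators=100),
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)                          # train on the training split
    f1 = f1_score(y_test, clf.predict(X_test))         # F1 on the held-out split
    print(f"{name:>16s}  F1 = {f1:.3f}")
```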
2.4. Computational Complexity
2.5. Performance Metrics
2.6. Testbed Framework
3. Proposed Methodology
3.1. Classification of Traffic Using the Traffic Flows in CICIDS2017 Dataset
3.2. Classification Using the Traffic Flows in CICIDS2017 Dataset and Attributes Selection
3.3. Classification Using Zeek for Traffic Flows Detection from the Packets in CICIDS2017 Dataset
4. Evaluations and Results
4.1. Results of Classification Using the Traffic Flows in CICIDS2017 Dataset
4.2. Results of Classification Using the Traffic Flows in CICIDS2017 Dataset and Attributes Selection
4.3. Results of Classification Using Zeek for Flows Detection from Packets in CICIDS2017 Dataset
5. Discussion
5.1. General Observations about the Results
5.2. Comparison with Related Work
5.3. Threats to the Validity
- External validity: It addresses the generalizability of the research experimental process. The quality of the results obtained depends on the dataset used, CICIDS2017, which was selected because it includes a wide range of up-to-date attacks and was developed considering most of the characteristics an IDS dataset should include. However, the experimental process should be rerun over different datasets to guarantee the generalizability of the classification models, since results in this kind of study can be very good (F1 values close to 1) yet highly dependent on the dataset. On the other hand, the techniques used in this work for traffic flow-based intrusion detection could be generalized to other domains that share common properties, since the underlying problem is essentially one of classification and feature selection; examples include software defect prediction models, which predict the quality and reliability of a software system, and data leakage protection (DLP) systems, which classify information as confidential or non-confidential or identify altered documents.
- Internal validity: It refers to the choice of prediction models and feature selection methods. In this work, ten classifiers and several different approaches were considered in the benchmark to make the study as broad as possible, including the use of the original labelled traffic flows in the dataset and the Zeek-based flows extracted from the raw packet captures, the comparison of multiclass vs. binary classification, and the application of FS methods. In any case, some changes in attribute selection or in the tuning of the tree models will probably be needed if the models are applied to other datasets, and more sophisticated classifiers and FS methods could also be deployed.
- Construct validity: It focuses on the choice of indicators used to evaluate the performance of classification models. Regarding this aspect, we estimated the following parameters for every classification test in the benchmark study: F1 score, accuracy, precision, recall, Cohen’s kappa coefficient, and MCC, among others. For the sake of simplicity and conciseness, we report the F1 score because it combines precision and recall and is the metric most commonly used in the literature to present traffic classification results for IDSs. However, other indicators, such as Cohen’s kappa coefficient or MCC, could be more appropriate to estimate classifier performance, since they measure the improvement with respect to the accuracy that would be obtained by mere chance and behave symmetrically with imbalanced classes.
6. Conclusions and Future Work
Author Contributions
Funding
Conflicts of Interest
References
- Check Point Research: Third Quarter of 2022 Reveals Increase in Cyberattacks and Unexpected Developments in Global Trends. Available online: https://blog.checkpoint.com/2022/10/26/third-quarter-of-2022-reveals-increase-in-cyberattacks/ (accessed on 31 October 2022).
- di Pietro, R.; Mancini, L.V. Intrusion Detection Systems; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2008; Volume 38, ISBN 978-0-387-77265-3.
- Kumar, S.; Gupta, S.; Arora, S. Research Trends in Network-Based Intrusion Detection Systems: A Review. IEEE Access 2021, 9, 157761–157779.
- García-Teodoro, P.; Díaz-Verdejo, J.; Maciá-Fernández, G.; Vázquez, E. Anomaly-based network intrusion detection: Techniques, systems and challenges. Comput. Secur. 2009, 28, 18–28.
- El-Maghraby, R.T.; Elazim, N.M.A.; Bahaa-Eldin, A.M. A survey on deep packet inspection. In Proceedings of the 12th International Conference on Computer Engineering and Systems (ICCES), Cairo, Egypt, 19–20 December 2017; pp. 188–197.
- Umer, M.F.; Sher, M.; Bi, Y. Flow-based intrusion detection: Techniques and challenges. Comput. Secur. 2017, 70, 238–254.
- Cisco IOS NetFlow. Available online: https://www.cisco.com/c/en/us/products/ios-nx-os-software/ios-netflow/index.html (accessed on 31 October 2022).
- Zeek Documentation. Available online: https://docs.zeek.org/en/master/about.html (accessed on 31 October 2022).
- Buczak, A.L.; Guven, E. A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun. Surv. Tutor. 2016, 18, 1153–1176.
- Wang, S.; Balarezo, J.F.; Kandeepan, S.; Al-Hourani, A.; Chavez, K.G.; Rubinstein, B. Machine learning in network anomaly detection: A survey. IEEE Access 2021, 9, 152379–152396.
- Ahmed, M.; Mahmood, A.N.; Hu, J. A survey of network anomaly detection techniques. J. Netw. Comput. Appl. 2016, 60, 19–31.
- Bhuyan, M.H.; Bhattacharyya, D.K.; Kalita, J.K. Network anomaly detection: Methods, systems and tools. IEEE Commun. Surv. Tutor. 2013, 16, 303–336.
- Tsai, C.; Hsu, Y.; Lin, C.; Lin, W. Intrusion detection by machine learning: A review. Expert Syst. Appl. 2009, 36, 11994–12000.
- Ilyas, M.U.; Alharbi, S.A. Machine learning approaches to network intrusion detection for contemporary internet traffic. Computing 2022, 104, 1061–1076.
- Alshammari, A.; Aldribi, A. Apply machine learning techniques to detect malicious network traffic in cloud computing. J. Big Data 2021, 8, 90.
- Ring, M.; Wunderlich, S.; Scheuring, D.; Landes, D.; Hotho, A. A survey of network-based intrusion detection data sets. Comput. Secur. 2019, 86, 147–167.
- Thakkar, A.; Lohiya, R. A review of the advancement in intrusion detection datasets. Procedia Comput. Sci. 2020, 167, 636–645.
- Lippmann, R.P.; Fried, D.J.; Graf, I.; Haines, J.W.; Kendall, K.R.; McClung, D.; Weber, D.; Webster, S.E.; Wyschogrod, D.; Cunningham, R.K.; et al. Evaluating intrusion detection systems: The 1998 DARPA off-line intrusion detection evaluation. DARPA Inf. Surviv. Conf. Expo. 2000, 3, 12–26.
- Lippmann, R.; Haines, J.W.; Fried, D.J.; Korba, J.; Das, K. The 1999 DARPA off-line intrusion detection evaluation. Comput. Netw. 2000, 34, 579–595.
- McHugh, J. Testing intrusion detection systems: A critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Trans. Inf. Syst. Secur. 2000, 3, 262–294.
- Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A.A. A detailed analysis of the KDD CUP 99 data set. In Proceedings of the IEEE Symposium on Computational Intelligence for Security and Defense Applications, Ottawa, ON, Canada, 8–10 July 2009; pp. 1–6.
- Shiravi, A.; Shiravi, H.; Tavallaee, M.; Ghorbani, A.A. Toward developing a systematic approach to generate benchmark datasets for intrusion detection. Comput. Secur. 2012, 31, 357–374.
- Maciá-Fernández, G.; Camacho, J.; Magán-Carrión, R.; García-Teodoro, P.; Therón, R. UGR’16: A new dataset for the evaluation of cyclostationarity-based network IDSs. Comput. Secur. 2018, 73, 411–424.
- Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems. In Proceedings of the Military Communications and Information Systems Conference (MilCIS), Canberra, ACT, Australia, 10–12 November 2015; pp. 1–6.
- Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic characterization. In Proceedings of the International Conference on Information Systems Security and Privacy (ICISSP), Funchal, Madeira, Portugal, 22–24 January 2018; pp. 108–116.
- Sharafaldin, I.; Gharib, A.; Lashkari, A.H.; Ghorbani, A.A. Towards a reliable intrusion detection benchmark dataset. J. Softw. Netw. 2017, 2017, 177–200.
- CICFlow Meter Tool. Available online: https://www.unb.ca/cic/research/applications.html (accessed on 10 October 2022).
- Kubat, M. An Introduction to Machine Learning; Springer International Publishing: Berlin/Heidelberg, Germany, 2021.
- John, G.H.; Langley, P. Estimating continuous distributions in Bayesian classifiers. In UAI’95: Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, Montreal, QC, Canada, 18–20 August 1995; Morgan Kaufmann: Burlington, MA, USA, 1995.
- Platt, J.C. Fast training of support vector machines using sequential minimal optimization. In Proceedings of the 2008 3rd International Conference on Intelligent System and Knowledge Engineering, Xiamen, China, 17–19 November 2008.
- Aha, D.W.; Kibler, D.; Albert, M.K. Instance-based learning algorithms. Mach. Learn. 1991, 6, 37–66.
- Freund, Y.; Schapire, R.E. Experiments with a new boosting algorithm. In ICML’96: Proceedings of the Thirteenth International Conference on Machine Learning, Bari, Italy, 3–6 July 1996; Morgan Kaufmann Publishers Inc.: Burlington, MA, USA, 1996; pp. 148–156. ISBN 978-1-55860-419-3.
- Holte, R.C. Very simple classification rules perform well on most commonly used datasets. Mach. Learn. 1993, 11, 63–90.
- Quinlan, J.R. C4.5: Programs for Machine Learning; Morgan Kaufmann Publishers: Burlington, MA, USA, 1993.
- Frank, E.; Witten, I.H. Generating accurate rule sets without global optimization. In ICML ’98: Proceedings of the Fifteenth International Conference on Machine Learning; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 1998.
- Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
- Witten, I.H.; Frank, E. Data Mining: Practical Machine Learning Tools and Techniques, 2nd ed.; Morgan Kaufmann: San Francisco, CA, USA, 2005.
- Frank, E.; Hall, M.A.; Witten, I.H. The WEKA Workbench. Online Appendix for “Data Mining: Practical Machine Learning Tools and Techniques”, 4th ed.; Morgan Kaufmann: Burlington, MA, USA, 2016.
- Alshammari, R.; Zincir-Heywood, A.N. A flow based approach for SSH traffic detection. In Proceedings of the 2007 IEEE International Conference on Systems, Man and Cybernetics, Montreal, QC, Canada, 7–10 October 2007.
- Elijah, A.V.; Abdullah, A.; JhanJhi, N.Z.; Supramaniam, M. Ensemble and Deep-Learning Methods for Two-Class and Multi-Attack Anomaly Intrusion Detection: An Empirical Study. Int. J. Adv. Comput. Sci. Appl. (IJACSA) 2019, 10, 9.
- Khalid, S.; Khalil, T.; Nasreen, S. A Survey of Feature Selection and Feature Extraction Techniques in Machine Learning. In Proceedings of the 2014 Science and Information Conference (SAI), London, UK, 27–29 August 2014; IEEE: Piscataway, NJ, USA; pp. 372–378.
- Wah, Y.B.; Ibrahim, N.; Hamid, H.A.; Abdul-Rahman, S.; Fong, S. Feature Selection Methods: Case of Filter and Wrapper Approaches for Maximising Classification Accuracy. Pertanika J. Sci. Technol. 2018, 26, 329–340.
- Guyon, I.; Gunn, S.; Nikravesh, M.; Zadeh, L.A. Feature Extraction: Foundations and Applications. Series Studies in Fuzziness and Soft Computing; Springer: Berlin/Heidelberg, Germany, 2006.
- Balogun, A.O.; Basri, S.; Abdulkadir, S.J.; Hashim, A.S. Performance Analysis of Feature Selection Methods in Software Defect Prediction: A Search Method Approach. Appl. Sci. 2019, 9, 2764.
- Balogun, A.O.; Basri, S.; Mahamad, S.; Abdulkadir, S.J.; Almomani, M.A.; Adeyemo, V.E.; Al-Tashi, Q.; Mojeed, H.A.; Imam, A.A.; Bajeh, A.O. Impact of Feature Selection Methods on the Predictive Performance of Software Defect Prediction Models: An Extensive Empirical Study. Symmetry 2020, 12, 1147.
- Nguyen, H.; Franke, K.; Petrovic, S. Improving effectiveness of intrusion detection by correlation feature selection. In Proceedings of the International Conference on Availability, Reliability, and Security (ARES), Krakow, Poland, 15–18 February 2010.
- Hall, M.A. Correlation-Based Feature Selection for Machine Learning. Doctoral Dissertation, University of Waikato, Hamilton, New Zealand, 1999.
- Engelen, G.; Rimmer, V.; Joosen, W. Troubleshooting an intrusion detection dataset: The CICIDS2017 case study. In Proceedings of the 2021 IEEE Symposium on Security and Privacy Workshops (SPW), San Francisco, CA, USA, 27 May 2021; pp. 7–12.
- Rosay, A.; Cheval, E.; Carlier, F.; Leroux, P. Network intrusion detection: A comprehensive analysis of CIC-IDS2017. In Proceedings of the 8th International Conference on Information Systems Security and Privacy (ICISSP 2022), Online, 9–11 February 2022; pp. 25–36.
- Abdulhammed, R.; Musafer, H.; Alessa, A.; Faezipour, M.; Abuzneid, A. Features dimensionality reduction approaches for machine learning based network intrusion detection. Electronics 2019, 8, 322.
- Kurniabudi; Stiawan, D.; Darmawijoyo; Idris, M.Y.B.; Bamhdi, A.M.; Budiarto, R. CICIDS-2017 dataset feature analysis with information gain for anomaly detection. IEEE Access 2020, 8, 132911–132921.
- Meemongkolkiat, N.; Suttichaya, V. Analysis on network traffic features for designing machine learning based IDS. J. Phys. Conf. Ser. 2021, 1993, 012029.
Files | Benign Flows and Attacks |
---|---|
Monday-Working Hours | Benign (Normal human activities) |
Tuesday-Working Hours | Benign, FTP-Patator, SSH-Patator |
Wednesday-Working Hours | Benign, DoS GoldenEye, DoS Hulk, DoS Slowhttptest, DoS slowloris, Heartbleed |
Thursday-Working Hours-Morning-WebAttacks | Benign, Web Attack-Brute Force, Web Attack-SQL Injection, Web Attack-XSS |
Thursday-Working Hours-Afternoon-Infilteration | Benign, Infiltration |
Friday-Working Hours-Morning | Benign, Bot |
Friday-Working Hours-Afternoon-PortScan | Benign, PortScan |
Friday-Working Hours-Afternoon-DDoS | Benign, DDoS |
Labeled Flow | Number of Flows |
---|---|
Benign | 2,359,087 |
DoS Hulk | 231,072 |
PortScan | 158,930 |
DDoS | 41,835 |
DoS GoldenEye | 10,293 |
FTP-Patator | 7938 |
SSH-Patator | 5897 |
DoS Slowloris | 5796 |
DoS Slowhttptest | 5499 |
Bot | 1966 |
Web Attack-Brute Force | 1507 |
Web Attack-XSS | 652 |
Infiltration | 36 |
Web Attack-SQL Injection | 21 |
Heartbleed | 11 |
Algorithm | Notation | Train Time Complexity | Test Time Complexity
---|---|---|---|
Naive Bayes | c: number of classes | O(n⋅m⋅c) | O(m⋅c)
Logistic regression model (Logistic) | | O(n⋅m² + m³) | O(m)
Multi-layer perceptron (MLP) | c: number of cycles, k: number of neurons | O(c⋅n⋅m⋅k) | O(m⋅k)
Support vector classifier (SMO) | s: number of support vectors | O(m⋅n² + n³) | O(s⋅m)
K-nearest neighbours classifier (IBk) | k: number of neighbours | O(1) | O(n⋅m)
Adaboost classifier (AB) | e: number of estimators | O(n⋅m⋅e) | O(m⋅e)
1R classifier (ONER) | | O(n⋅m⋅log n) | O(1)
Partial C4.5 decision tree (PART) | k: depth of tree | O(n⋅m⋅log n) | O(k)
C4.5 decision tree (J48) | k: depth of tree | O(n²⋅m) | O(k)
Random Forest (RF) | t: number of trees, k: depth of tree | O(n⋅m⋅t⋅log n) | O(k⋅t)
Class\Prediction | Attack | Normal |
---|---|---|
Attack | TP | FN |
Normal | FP | TN |
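From the TP, FN, FP, and TN counts defined by this confusion matrix, the metrics used throughout the paper follow their standard definitions; they are recalled here for reference (the MCC expression shown is the usual binary form):

```latex
\mathrm{Precision} = \frac{TP}{TP+FP},\qquad
\mathrm{Recall} = \frac{TP}{TP+FN},\qquad
\mathrm{Accuracy} = \frac{TP+TN}{TP+TN+FP+FN}
```

```latex
F_1 = \frac{2\,\mathrm{Precision}\cdot\mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}},\qquad
\mathrm{MCC} = \frac{TP\cdot TN - FP\cdot FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}
```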
File Name | Attributes |
---|---|
Friday Afternoon DDoS | Total Length of Fwd Packets, Total Length of Bwd Packets, Fwd Packet Length Max, Bwd Packet Length Min, URG Flag Count, Subflow Bwd Bytes, Init_Win_Bytes_Forward, Min_Seg_Size_Forward |
Friday Afternoon PortScan | Bwd Packet Length Mean, PSH Flag Count, Init_Win_Bytes_Backward, Act_Data_Pkt_Fwd, Min_Seg_Size_Forward |
Friday Morning Bot | Bwd Packet Length Mean, Bwd Packet Length Std, Fwd IAT Max, Packet Length Std, Avg Bwd Segment Size, Init_Win_Bytes_Backward, Min_Seg_Size_Forward |
Thursday Afternoon Infiltration | Total Length of Fwd Packets, Active Std, Active Min, Idle Std |
Thursday Morning Web Attacks | Fwd Packet Length Mean, Fwd IAT Min, Init_Win_Bytes_Backward |
Tuesday | Fwd Packet Length Mean, Fwd Packet Length Std, Bwd Packet Length Std, Fwd PSH Flags, Avg Fwd Segment Size, Init_Win_Bytes_Forward, Init_Win_Bytes_Backward, Min_Seg_Size_Forward |
Wednesday DoS | Total Length of Bwd Packets, Bwd Packet Length Mean, Min Packet Length, Init_Win_Bytes_Forward, Init_Win_Bytes_Backward, Idle Max |
File Name | Attributes |
---|---|
All files | Bwd Packet Length Mean, Bwd Packet Length Std, Packet Length Std, Init_Win_Bytes_Forward, Init_Win_Bytes_Backward, Active Mean |
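The attribute subsets listed in the two tables above come from the correlation-based feature selection step (Section 3.2). Purely as an illustrative sketch, and not necessarily the configuration used in this work, the snippet below implements a greedy forward search that maximizes Hall's CFS merit [47], using absolute Pearson correlations as a simplified stand-in for the symmetrical-uncertainty measure; the DataFrame X, the numeric label y, and the max_features cap are assumptions for the example.

```python
# Illustrative CFS-style feature selection: greedy forward search maximizing the merit
# Merit_S = k * r_cf / sqrt(k + k*(k-1)*r_ff), with correlations as a simplified proxy.
import numpy as np
import pandas as pd

def cfs_merit(subset, feat_class_corr, feat_feat_corr):
    """CFS merit of a candidate subset of k features."""
    k = len(subset)
    r_cf = feat_class_corr[subset].mean()                 # mean feature-class correlation
    if k == 1:
        return r_cf
    r_ff = np.mean([feat_feat_corr.loc[a, b]              # mean pairwise feature-feature corr.
                    for i, a in enumerate(subset) for b in subset[i + 1:]])
    return (k * r_cf) / np.sqrt(k + k * (k - 1) * r_ff)

def cfs_forward_select(X: pd.DataFrame, y: pd.Series, max_features: int = 10):
    feat_class_corr = X.apply(lambda col: abs(col.corr(y)))   # |corr(feature, class)|
    feat_feat_corr = X.corr().abs()                           # |corr(feature, feature)|
    selected, best_merit = [], 0.0
    candidates = list(X.columns)
    while candidates and len(selected) < max_features:
        merits = {f: cfs_merit(selected + [f], feat_class_corr, feat_feat_corr)
                  for f in candidates}
        f_best = max(merits, key=merits.get)
        if merits[f_best] <= best_merit:                      # stop when merit stops improving
            break
        selected.append(f_best)
        best_merit = merits[f_best]
        candidates.remove(f_best)
    return selected
```

With the flow attributes in X and y encoded numerically (e.g., BENIGN = 0, attack = 1), cfs_forward_select(X, y) returns a small subset in the spirit of the selections shown above; the exact subset will generally differ, since the full CFS uses symmetrical uncertainty and a best-first search.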
Attributes |
---|
TimeStamp, Source IP, Source Port, Destination IP, Destination Port, Protocol, Service, Duration, Source Bytes, Response Bytes, Conn_state, Missed Bytes, History, Source Packets, Source IP bytes, Response Packets, Response IP Bytes, Tunnel parents, Label |
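These attributes map directly onto the fields Zeek writes to its conn.log connection log, plus the label added afterwards. As a minimal sketch, assuming Zeek's default TSV logging on the CICIDS2017 pcap files and the standard conn.log field layout (the file path and the exact field list are assumptions and may vary across Zeek versions), the flows could be loaded as follows:

```python
# Illustrative sketch: load Zeek conn.log flows and keep the attributes listed above.
import pandas as pd

# Default conn.log field order (recent Zeek versions may append extra fields).
CONN_FIELDS = [
    "ts", "uid", "id.orig_h", "id.orig_p", "id.resp_h", "id.resp_p",
    "proto", "service", "duration", "orig_bytes", "resp_bytes",
    "conn_state", "local_orig", "local_resp", "missed_bytes", "history",
    "orig_pkts", "orig_ip_bytes", "resp_pkts", "resp_ip_bytes", "tunnel_parents",
]

def load_conn_log(path: str = "conn.log") -> pd.DataFrame:
    # Zeek TSV logs start with '#' header/footer lines; '-' marks an unset field.
    df = pd.read_csv(path, sep="\t", comment="#", names=CONN_FIELDS, na_values="-")
    keep = ["ts", "id.orig_h", "id.orig_p", "id.resp_h", "id.resp_p", "proto", "service",
            "duration", "orig_bytes", "resp_bytes", "conn_state", "missed_bytes", "history",
            "orig_pkts", "orig_ip_bytes", "resp_pkts", "resp_ip_bytes", "tunnel_parents"]
    return df[keep]

flows = load_conn_log("conn.log")   # e.g., produced by `zeek -r Tuesday-WorkingHours.pcap`
print(flows.shape, flows.columns.tolist())
```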
File | CICIDS2017 Flows | Zeek Flows |
---|---|---|
Tuesday | 445,909 | 322,676 (original) |
Wednesday | 691,703 | 508,801 (original, with DoS Hulk); 345,347 (without DoS Hulk)
Thursday | 170,366 | 113,005 (without Infiltration) |
Friday | 704,245 | 546,795 (original); 542,567 (PortScan fixed); 546,790 (Bot fixed); 542,562 (PortScan and Bot fixed)
File | Label | NB | Logist | MLP | SMO | IBk | AB | ONER | PART | J48 | RF |
---|---|---|---|---|---|---|---|---|---|---|---|
Friday Afternoon DDoS | BENIGN | 0.937 | 0.999 | 0.980 | 0.980 | 1 | 0.997 | 0.988 | 1 | 1 | 1 |
DDoS | 0.956 | 0.999 | 0.986 | 0.985 | 1 | 0.998 | 0.991 | 1 | 1 | 1 |
Friday Afternoon PortScan | BENIGN | 0.898 | 0.999 | 0.992 | 0.995 | 1 | 0.998 | 0.995 | 1 | 1 | 1 |
PortScan | 0.929 | 0.999 | 0.994 | 0.996 | 1 | 0.998 | 0.996 | 1 | 1 | 1 | |
Friday Morning Bot | BENIGN | 0.904 | 0.995 | 0.995 | 0.996 | 0.999 | 0.998 | 0.998 | 1 | 1 | 1 |
Bot | 0.107 | 0.995 | 0.029 | 0.397 | 0.938 | 0.709 | 0.727 | 0.981 | 0.966 | 0.971 | |
Thursday Afternoon Infiltration | BENIGN | 0.997 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Infiltration | 0.025 | 0.389 | ? = 0 | 0.353 | 0.500 | ? = 0 | 0.211 | 0.762 | 0.444 | 0.667 | |
Thursday Morning Web Attacks | BENIGN | 0.925 | 0.999 | 0.993 | 0.993 | 1 | 0.993 | 0.998 | 1 | 1 | 1 |
Web Attack Brute Force | 0.021 | 0.769 | 0 | 0 | 0.728 | ? = 0 | 0.685 | 0.811 | 0.812 | 0.750 | |
Web Attack XSS | 0.417 | 0.102 | ? = 0 | ? = 0 | 0.385 | ? = 0 | 0.038 | 0.104 | 0.102 | 0.331 | |
Web Attack SQL Injection | 0.009 | 0.105 | ? = 0 | ? = 0 | 0.111 | ? = 0 | ? = 0 | 0.667 | 0.455 | 0.133 | |
Tuesday | BENIGN | 0.878 | 0.996 | 0.984 | 0.990 | 1 | 0.984 | 0.996 | 1 | 1 | 1 |
FTP Patator | 0.262 | 0.989 | ? = 0 | 0.587 | 0.997 | ? = 0 | 0.987 | 0.999 | 0.998 | 0.999 | |
SSH Patator | 0.189 | 0.651 | ? = 0 | 0.633 | 0.986 | ? = 0 | 0.634 | 0.998 | 0.994 | 0.997 | |
Wednesday DoS | BENIGN | 0.755 | 0.989 | 0.96 | 0.965 | 0.999 | 0.96 | 0.954 | 1 | 0.999 | 0.999 |
DoS slowloris | 0.255 | 0.978 | ? = 0 | 0.875 | 0.989 | ? = 0 | 0.386 | 0.992 | 0.992 | 0.994 | |
DoS Slowhttptest | 0.234 | 0.948 | ? = 0 | 0.859 | 0.974 | ? = 0 | ? = 0 | 0.981 | 0.982 | 0.985 | |
DoS Hulk | 0.814 | 0.981 | 0.945 | 0.943 | 0.999 | 0.934 | 0.946 | 1 | 0.999 | 0.999 | |
DoS GoldenEye | 0.422 | 0.981 | ? = 0 | 0.939 | 0.996 | ? = 0 | 0.400 | 0.996 | 0.995 | 0.996 | |
Heartbleed | 0.833 | 0.480 | ? = 0 | 0.923 | 0.833 | ? = 0 | ? = 0 | 0.833 | 0.923 | 1 |
NB | Logist | MLP | SMO | IBk | AB | ONER | PART | J48 | RF | |
---|---|---|---|---|---|---|---|---|---|---|
Time to build the model (s) | 2.52 | 741 | 103 | 170 | 0.06 | 62.61 | 3.58 | 76.62 | 48.65 | 171 |
Time to test the model (s) | 3.91 | 0.47 | 0.29 | 0.31 | 1730 | 0.24 | 0.21 | 0.14 | 0.11 | 2.85 |
Number of flows/s classified in test phase | 28,867 | 240,153 | 389,214 | 364,103 | 65 | 470,300 | 537,485 | 806,228 | 1,026,109 | 39,604 |
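The throughput row follows directly from the timing rows: it is the number of test instances divided by the test time. Cross-checking two columns of the table (a consistency check on the reported figures, not an additional measurement):

```latex
\text{flows/s} = \frac{N_{\text{test}}}{t_{\text{test}}}:\qquad
\text{J48: } 1{,}026{,}109\ \text{flows/s} \times 0.11\ \text{s} \approx 112{,}872\ \text{flows},\qquad
\text{NB: } 28{,}867\ \text{flows/s} \times 3.91\ \text{s} \approx 112{,}870\ \text{flows}
```

Both products point to the same test split of roughly 113,000 flows, which suggests all classifiers were timed on the same test set.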
a | b | c | d | ← Classified as |
---|---|---|---|---|
84,046 | 5 | 13 | 2 | a = BENIGN |
6 | 738 | 0 | 0 | b = Web Attack Brute Force |
8 | 330 | 20 | 1 | c = Web Attack XSS |
8 | 1 | 0 | 5 | d = Web Attack SQL Injection |
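As a worked example of how the per-class F1 scores follow from such a confusion matrix, consider the Web Attack Brute Force class (column/row b): its true positives lie on the diagonal, its false negatives in the rest of row b, and its false positives in the rest of column b.

```latex
TP = 738,\quad FN = 6,\quad FP = 5 + 330 + 1 = 336
```
```latex
\mathrm{Precision} = \tfrac{738}{738+336} \approx 0.687,\qquad
\mathrm{Recall} = \tfrac{738}{738+6} \approx 0.992,\qquad
F_1 \approx 0.812
```

This is consistent with the values around 0.81 reported for this class in the multiclass results table.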
a | b | ← Classified as |
---|---|---|
84,048 | 18 | a = BENIGN |
20 | 1097 | b = ATTACK |
File | Label | NB | Logist | MLP | SMO | IBk | AB | ONER | PART | J48 | RF |
---|---|---|---|---|---|---|---|---|---|---|---|
Friday Afternoon DDoS | BENIGN | 0.937 | 0.999 | 0.980 | 0.980 | 1 | 0.997 | 0.988 | 1 | 1 | 1 |
DDoS | 0.956 | 0.999 | 0.986 | 0.985 | 1 | 0.998 | 0.991 | 1 | 1 | 1 |
Friday Afternoon PortScan | BENIGN | 0.898 | 0.999 | 0.992 | 0.995 | 1 | 0.998 | 0.995 | 1 | 1 | 1 |
PortScan | 0.929 | 0.999 | 0.994 | 0.996 | 1 | 0.998 | 0.996 | 1 | 1 | 1 | |
Friday Morning Bot | BENIGN | 0.904 | 0.995 | 0.995 | 0.996 | 0.999 | 0.998 | 0.998 | 1 | 1 | 1 |
Bot | 0.107 | 0.995 | 0.029 | 0.397 | 0.938 | 0.709 | 0.727 | 0.981 | 0.966 | 0.971 | |
Thursday Afternoon Infiltration | BENIGN | 0.997 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Infiltration | 0.025 | 0.389 | ? = 0 | 0.353 | 0.500 | ? = 0 | 0.211 | 0.762 | 0.444 | 0.667 |
Thursday Morning Web Attacks | BENIGN | 0.926 | 0.999 | 0.993 | 0.996 | 1 | 0.996 | 0.998 | 1 | 1 | 1 |
ATTACK | 0.159 | 0.944 | 0 | 0.747 | 0.969 | 0.744 | 0.856 | 0.987 | 0.983 | 0.974 | |
Tuesday | BENIGN | 0.878 | 0.996 | 0.984 | 0.991 | 1 | 0.989 | 0.996 | 1 | 1 | 1 |
ATTACK | 0.227 | 0.868 | ? = 0 | 0.644 | 0.993 | 0.429 | 0.866 | 0.998 | 0.996 | 0.998 | |
Wednesday DoS | BENIGN | 0.871 | 0.970 | 0.970 | 0.965 | 0.999 | 0.979 | 0.954 | 1 | 0.999 | 0.999 |
ATTACK | 0.806 | 0.949 | 0.947 | 0.941 | 0.999 | 0.964 | 0.914 | 0.999 | 0.999 | 0.999 |
File | Label | NB | Logist | MLP | SMO | IBk | AB | ONER | PART | J48 | RF |
---|---|---|---|---|---|---|---|---|---|---|---|
All files | BENIGN | 0.709 | 0.961 | 0.935 | 0.954 | 0.999 | 0.964 | 0.966 | 0.999 | 0.999 | 0.999 |
ATTACK | 0.516 | 0.853 | 0.602 | 0.807 | 0.996 | 0.829 | 0.841 | 0.998 | 0.998 | 0.997 |
File | Label | NB | Logist | MLP | SMO | IBk | AB | ONER | PART | J48 | RF |
---|---|---|---|---|---|---|---|---|---|---|---|
Friday Afternoon DDoS | BENIGN | 0.750 | 0.966 | 0.964 | 0.942 | 0.999 | 0.998 | 0.988 | 0.999 | 0.999 | 0.999 |
DDoS | 0.868 | 0.976 | 0.974 | 0.960 | 0.999 | 0.998 | 0.991 | 0.999 | 0.999 | 1 |
Friday Afternoon PortScan | BENIGN | 0.899 | 0.987 | 0.991 | 0.988 | 0.999 | 0.991 | 0.992 | 0.999 | 0.999 | 0.999 |
PortScan | 0.930 | 0.990 | 0.993 | 0.990 | 0.999 | 0.993 | 0.994 | 0.999 | 0.999 | 0.999 | |
Friday Morning Bot | BENIGN | 0.794 | 0.995 | 0.995 | 0.995 | 0.998 | 0.995 | 0.998 | 0.998 | 0.998 | 0.998 |
Bot | 0.056 | 0.020 | 0.030 | ? = 0 | 0.792 | ? = 0 | 0.727 | 0.793 | 0.794 | 0.795 | |
Thursday Afternoon Infiltration | BENIGN | 0.995 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Infiltration | 0.015 | 0.435 | ? = 0 | 0.250 | 0.385 | ? = 0 | 0.303 | 0.500 | 0.3 | 0.667 | |
Thursday Morning Web Attacks | BENIGN | 0.229 | 0.993 | 0.993 | 0.993 | 1 | 0.993 | 0.998 | 1 | 1 | 1 |
Web Attack Brute Force | 0.001 | ? = 0 | ? = 0 | ? = 0 | 0.765 | ? = 0 | 0.685 | 0.801 | 0.803 | 0.758 | |
Web Attack XSS | 0.393 | ? = 0 | ? = 0 | ? = 0 | 0.265 | ? = 0 | 0.038 | 0.112 | 0.113 | 0.283 | |
Web Attack SQL Injection | 0.008 | ? = 0 | ? = 0 | ? = 0 | 0.125 | ? = 0 | ? = 0 | 0.400 | 0.222 | 0.133 | |
Tuesday | BENIGN | 0.970 | 0.984 | 0.984 | 0.984 | 1 | 0.984 | 0.996 | 1 | 1 | 1 |
FTP Patator | 0.945 | 0 | ? = 0 | ? = 0 | 0.999 | ? = 0 | 0.955 | 0.998 | 0.999 | 0.999 | |
SSH Patator | 0.321 | 0 | ? = 0 | ? = 0 | 0.997 | ? = 0 | 0.634 | 0.992 | 0.992 | 0.998 | |
Wednesday DoS | BENIGN | 0.790 | 0.920 | 0.900 | 0.898 | 0.999 | 0.899 | 0.954 | 0.999 | 0.999 | 0.999 |
DoS slowloris | 0.309 | 0.100 | ? = 0 | ? = 0 | 0.956 | ? = 0 | 0.386 | 0.958 | 0.956 | 0.960 | |
DoS Slowhttptest | 0.035 | 0 | ? = 0 | ? = 0 | 0.929 | ? = 0 | ? = 0 | 0.926 | 0.930 | 0.932 | |
DoS Hulk | 0.787 | 0.882 | 0.782 | 0.779 | 0.999 | 0.782 | 0.946 | 0.999 | 0.999 | 0.999 | |
DoS GoldenEye | 0.226 | 0.087 | ? = 0 | ? = 0 | 0.986 | ? = 0 | 0.400 | 0.993 | 0.991 | 0.993 | |
Heartbleed | 0.600 | 1 | ? = 0 | ? = 0 | 0.923 | ? = 0 | ? = 0 | 0.600 | 0.600 | 0.600 |
NB | Logist | MLP | SMO | IBk | AB | ONER | PART | J48 | RF | |
---|---|---|---|---|---|---|---|---|---|---|
Decrease in time to build the model (%) | 89 | 99 | 71 | 70 | 67 | 94 | 94 | 96 | 95 | 68 |
Decrease in time to test the model (%) | 86 | 72 | 66 | 74 | 19 | 63 | 71 | 36 | 45 | 58 |
File | Label | NB | Logist | MLP | SMO | IBk | AB | ONER | PART | J48 | RF |
---|---|---|---|---|---|---|---|---|---|---|---|
All files | BENIGN | 0.898 | 0.930 | 0.935 | 0.934 | 0.994 | 0.934 | 0.966 | 0.994 | 0.994 | 0.994 |
ATTACK | 0.504 | 0.577 | 0.605 | 0.602 | 0.976 | 0.603 | 0.841 | 0.976 | 0.976 | 0.976 |
File | Label | NB | Logist | MLP | SMO | IBk | AB | ONER | PART | J48 | RF |
---|---|---|---|---|---|---|---|---|---|---|---|
Tuesday | BENIGN | 0.859 | 1 | 0.989 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
ATTACK | 0.152 | 0.998 | ? = 0 | 0.998 | 0.998 | 0.996 | 0.987 | 0.997 | 0.997 | 0.998 | |
Wednesday (without Hulk) | BENIGN | 0.891 | 0.999 | 0.958 | 0.998 | 0.999 | 0.994 | 0.996 | 0.999 | 0.998 | 0.999 |
ATTACK | 0.471 | 0.985 | ? = 0 | 0.974 | 0.987 | 0.931 | 0.961 | 0.986 | 0.980 | 0.986 | |
Thursday (without Infiltration) | BENIGN | 0.892 | 0.997 | 0.991 | 0.991 | 0.995 | 0.996 | 0.996 | 0.997 | 0.997 | 0.996 |
ATTACK | 0.160 | 0.844 | ? = 0 | ? = 0 | 0.760 | 0.790 | 0.834 | 0.872 | 0.876 | 0.801 | |
Friday (PS and BOT fixed) | BENIGN | 0.905 | 0.997 | ? = 0 | 0.996 | 0.999 | 0.994 | 0.996 | 0.999 | 0.999 | 0.999 |
ATTACK | 0.912 | 0.997 | 0.642 | 0.995 | 0.999 | 0.993 | 0.995 | 0.999 | 0.999 | 0.999 |
File | Label | NB | Logist | MLP | SMO | IBk | AB | ONER | PART | J48 | RF |
---|---|---|---|---|---|---|---|---|---|---|---|
All files | BENIGN | 0.831 | 0.985 | 0.875 | 0.975 | 0.998 | 0.982 | 0.993 | 0.998 | 0.998 | 0.998 |
ATTACK | 0.663 | 0.980 | ? = 0 | 0.962 | 0.992 | 0.936 | 0.975 | 0.993 | 0.993 | 0.992 |
File | Label | PART | J48 | RF |
---|---|---|---|---|
Friday (original) | BENIGN | 0.998 | 0.998 | 0.997 |
ATTACK | 0.997 | 0.997 | 0.997 | |
Friday (PS fixed) | BENIGN | 0.999 | 0.999 | 0.999 |
ATTACK | 0.999 | 0.999 | 0.999 | |
Friday (BOT fixed) | BENIGN | 0.998 | 0.997 | 0.997 |
ATTACK | 0.997 | 0.997 | 0.997 | |
Friday (PS and BOT fixed) | BENIGN | 0.999 | 0.999 | 0.999 |
ATTACK | 0.999 | 0.999 | 0.999 |
File | Label | PART | J48 | RF |
---|---|---|---|---|
Wednesday (original) | BENIGN | 0.999 | 0.980 | 0.999 |
ATTACK | 0.998 | 0.997 | 0.998 | |
Wednesday (without Hulk) | BENIGN | 0.999 | 0.998 | 0.999 |
ATTACK | 0.986 | 0.980 | 0.986 |
Reference | ML Technique | F1 Score | FS Method | N Attr. | Comments |
---|---|---|---|---|---|
Sharafaldin et al. [25] | ID3 | 0.980 | RandomForest Regressor | 4 per class | Weighted evaluation metrics |
Engelen et al. [48] | RF | 0.990 | NA * | NA * | |
Rosay et al. [49] | RF | 0.999 | Not used | 74 | Dest. port included. Regeneration and labeling of flows |
Abdulhammed et al. [50] | RF | 0.988 | PCA | 10 | IP addresses and ports included |
Kurniabudi et al. [51] | RF | 0.998 | Information Gain ranker | 15 | Dest. port included |
Meemongkolkiat et al. [52] | Bagging ensemble | 0.999 | Recursive Feature Elimination & RF | 18 | Dest. and source ports included. 30 percent of the dataset |
Our work | PART/J48 | 0.999 | Not used | 77 | |
RF/J48 | 0.990 | CFS | 6 | ||
PART/J48 | 0.997 | Not used | 14 | Zeek-based flows and attributes |