1. Introduction
In recent years, with the rapid development of cloud computing, LoRa, NB-IoT, 5G communication, and artificial intelligence technologies, the Internet of Things (IoT) has grown rapidly, and hundreds of millions of devices are now connected to it. However, because many IoT nodes collect and store large amounts of private user data, IoT systems have become an attractive target for cyber attackers, and attacks on the IoT are increasing [1,2]. Gemalto’s IoT security report shows that more than half of companies still cannot determine whether they have suffered attacks that exploit IoT vulnerabilities. In addition, the report, which surveyed 950 IT and business decision makers, found that only 59% of companies encrypted all IoT-related data [3]. The popularity of IoT technology and the growing intelligence of devices have brought great convenience to people, but they have also introduced new security and privacy risks. For example, on 29 January 2018, the top three banks in the Netherlands (ABN AMRO, ING Bank, and Rabobank) were hit by distributed denial-of-service (DDoS) attacks that blocked access to their websites and internet banking services [4]. In February 2018, the Pyeongchang Winter Olympics in South Korea suffered a cyber attack that interrupted the live broadcast [5]. Therefore, securing IoT systems is becoming central to the successful deployment of IoT networks, and detecting intruders is an important step in ensuring their security. Intrusion detection is one of several mechanisms for managing security intrusions [6]. It monitors network traffic for abnormal or suspicious activity and issues alerts when such activity is discovered. Intrusion detection systems (IDSs) can be classified into host-based intrusion detection systems (HIDSs) and network-based intrusion detection systems (NIDSs). In this paper, we study a network-based intrusion detection system. We use generative models and deep learning techniques to build intrusion detection classifiers that detect a wide variety of attacks, such as DoS (denial of service), probe, U2R (user to root), R2L (remote to local), worms, shellcode, backdoor, reconnaissance, and generic attacks.
In recent years, researchers have proposed many approaches to intrusion detection, including anomaly detection methods, shallow learning methods, deep learning methods, and ensemble methods. Anomaly detection methods estimate the distribution of normal network data and treat any data that deviates from it as an anomaly; examples include Bayesian models [7,8], clustering algorithms (K-Means, spectral clustering, DBSCAN, etc.) [7], the self-organizing map (SOM) [9], the Gaussian mixture model (GMM) [10], and the one-class SVM [11]. Shallow learning methods build a classifier on selected features to detect intrusions; examples include the support vector machine (SVM) [12], decision tree (DT) [13], and k-nearest neighbor (KNN) [14]. Deep learning methods automatically extract features and perform classification; examples include the AutoEncoder [15,16], deep neural network (DNN) [17], deep belief network (DBN) [18,19,20,21], and recurrent neural network (RNN) [22]. The last category uses various ensemble and hybrid techniques to improve detection performance, including bagging [23], boosting [24], stacking [25], and combined classifier methods [26].
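To make the anomaly detection idea concrete, the following minimal sketch fits a single Gaussian to one feature of normal traffic and flags values that deviate strongly from it. It is an illustration only, not one of the cited methods: the feature (bytes per connection), the sample values, and the three-standard-deviation threshold are all assumptions.

```python
import statistics

def fit_normal_profile(values):
    """Estimate the mean and standard deviation of a feature on normal traffic."""
    return statistics.mean(values), statistics.pstdev(values)

def is_anomaly(value, mean, std, threshold=3.0):
    """Flag a value whose distance from the mean exceeds `threshold` standard deviations."""
    if std == 0:
        return value != mean
    return abs(value - mean) / std > threshold

# Hypothetical byte counts per connection observed during normal operation.
normal_bytes = [480, 510, 495, 505, 520, 490, 500, 515]
mean, std = fit_normal_profile(normal_bytes)

# One value close to the normal profile and one far from it.
flags = [is_anomaly(v, mean, std) for v in (502, 5000)]
```

Methods such as the GMM or one-class SVM replace the single Gaussian with a richer model of normal behavior, but the decision rule has the same shape: score a record against the normal profile and compare the score with a threshold.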
Deep learning is a branch of machine learning based on learning data representations, and it has become an active research topic. It can automatically extract high-level latent features without manual intervention [27]. Deep learning is widely applied in many fields of artificial intelligence, including speech processing, computer vision, and natural language processing. It has also been applied to network security detection [28,29]. However, intrusion detection systems still face many problems. First, the different types of network traffic in a real network environment are imbalanced, and intrusion records are far fewer than normal records. A classifier is therefore biased towards the more frequent records, which reduces the detection rate for minority attacks such as R2L and worms. Second, because network traffic is high-dimensional, many intrusion detection models apply feature selection as a pre-processing step [30], for example principal component analysis (PCA) or chi-square feature selection. However, these methods rely heavily on manual feature extraction, which depends mainly on experience and luck, and they are not effective enough. Third, because network traffic is large in volume and complex in structure, traditional classifiers struggle to achieve a high detection rate. Fourth, real-world network environments and structures keep changing; for example, the IoT and cloud services are now widely used, and new attacks continue to emerge. Since many unknown attacks do not appear in the training dataset, traditional intrusion detection methods usually perform poorly at detecting them.
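The effect of class imbalance can be illustrated with a small sketch. The class proportions below are hypothetical and are not taken from any dataset used in this paper. A classifier that always predicts the most frequent class reaches high overall accuracy while detecting no minority attacks at all.

```python
from collections import Counter

def majority_class(train_labels):
    """Return the most frequent label in the training data."""
    return Counter(train_labels).most_common(1)[0][0]

def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def detection_rate(y_true, y_pred, cls):
    """Fraction of records of class `cls` that are predicted as `cls`."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == cls]
    return sum(p == cls for p in relevant) / len(relevant)

# Hypothetical imbalanced label set: 95 normal, 4 DoS, 1 R2L record.
y_true = ["normal"] * 95 + ["dos"] * 4 + ["r2l"] * 1

# A degenerate classifier that always outputs the majority class.
y_pred = [majority_class(y_true)] * len(y_true)

overall_accuracy = accuracy(y_true, y_pred)
r2l_detection_rate = detection_rate(y_true, y_pred, "r2l")
```

In this toy setting the overall accuracy is 0.95 although the R2L detection rate is 0, which is why overall accuracy alone is a poor indicator of performance on minority attacks.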
Taking these factors into account, we propose a novel intrusion detection method called ICVAE-DNN, which combines an improved conditional variational AutoEncoder (ICVAE) with a DNN. The variational AutoEncoder (VAE), proposed by Kingma et al. [31] in 2013, is an important generative model consisting of an encoder (a recognition network) and a decoder (a generator network), both of which use deep neural networks to characterize the distribution of the data and the latent variables. A VAE can generate samples, but it cannot generate samples of a specified class from labels. Therefore, CVAE was developed by Kingma et al. [32] in 2014. The CVAE is an extension of the VAE [33]. It embeds a one-hot encoded label vector in both the encoder and the decoder, which converts the unsupervised training mode into a supervised one. CVAE not only automatically extracts high-level features and reduces the dimension of network features, but also generates new attack samples of specified categories. So that the CVAE encoder can be used to initialize the weights of the DNN hidden layers, we modify CVAE by embedding the intrusion labels only in the decoder and not in the encoder; we call the resulting model ICVAE.
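The data flow described above can be sketched as follows. This is a minimal, untrained illustration under stated assumptions, not the architecture used in this paper: the layer sizes, the tanh and sigmoid activations, the random weights, and all function and parameter names are hypothetical. The point is only that the encoder sees the features alone, while the decoder receives the latent vector concatenated with a one-hot label.

```python
import math
import random

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases for one dense layer."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def one_hot(label, n_classes):
    return [1.0 if i == label else 0.0 for i in range(n_classes)]

def encode(x, params):
    """Encoder: maps features x (no label) to the mean and log-variance of z."""
    h = [math.tanh(v) for v in linear(x, *params["enc_h"])]
    return linear(h, *params["enc_mu"]), linear(h, *params["enc_logvar"])

def sample_latent(mu, log_var, rng):
    """Reparameterization: z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def decode(z, label, n_classes, params):
    """Decoder: reconstructs features from z concatenated with the one-hot label."""
    h = [math.tanh(v) for v in linear(z + one_hot(label, n_classes), *params["dec_h"])]
    return [1.0 / (1.0 + math.exp(-v)) for v in linear(h, *params["dec_out"])]

rng = random.Random(0)
N_FEATURES, N_HIDDEN, N_LATENT, N_CLASSES = 6, 4, 2, 3  # assumed toy sizes
params = {
    "enc_h": init_layer(N_HIDDEN, N_FEATURES, rng),
    "enc_mu": init_layer(N_LATENT, N_HIDDEN, rng),
    "enc_logvar": init_layer(N_LATENT, N_HIDDEN, rng),
    "dec_h": init_layer(N_HIDDEN, N_LATENT + N_CLASSES, rng),
    "dec_out": init_layer(N_FEATURES, N_HIDDEN, rng),
}

# Reconstruction path: encode a record, sample z, decode with its label.
x = [0.2, 0.0, 1.0, 0.5, 0.1, 0.9]
mu, log_var = encode(x, params)
z = sample_latent(mu, log_var, rng)
x_hat = decode(z, label=2, n_classes=N_CLASSES, params=params)

# Generation path: sample z from the prior and request a specific class.
z_new = [rng.gauss(0.0, 1.0) for _ in range(N_LATENT)]
x_new = decode(z_new, label=1, n_classes=N_CLASSES, params=params)
```

Because the encoder takes no label input, its layers have the same input shape as a classifier operating on raw features, which is what allows them to be reused to initialize the DNN hidden layers.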
The main contributions of this paper are as follows. First, we use ICVAE to learn the distribution of complex traffic and classes through supervised learning. The network parameters of the ICVAE encoder are used to initialize the weights of the DNN hidden layers. Second, latent variables with Gaussian noise and specified labels are fed into the trained ICVAE decoder (the generator network) to generate new attack records of specific classes. This balances the training data and increases the diversity of the training samples, thus improving the detection rate for minority attacks and unknown attacks. Third, the DNN automatically extracts high-level features and adjusts the network weights through back propagation and fine-tuning, which better addresses the classification of complex, large-scale, and non-linear network traffic. Finally, the proposed model is evaluated on the NSL-KDD [34,35] and UNSW-NB15 [36,37,38] datasets. Compared with well-known classification methods, the proposed model not only achieves better overall accuracy, recall, and false positive rate, but also a higher detection rate for minority attacks and unknown attacks.
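The first two contributions can be sketched as follows. This is an illustration under assumptions rather than the procedure used in this paper: the balancing target (matching the largest class), the stand-in decoder `toy_decoder`, the toy weights, and all function names are hypothetical.

```python
import copy
import random
from collections import Counter

def samples_needed(labels):
    """Number of synthetic records each class needs to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items() if n < target}

def generate_for_class(decoder, label, n_samples, latent_dim, rng):
    """Draw z ~ N(0, I) and decode it together with the requested label."""
    return [decoder([rng.gauss(0.0, 1.0) for _ in range(latent_dim)], label)
            for _ in range(n_samples)]

def init_hidden_from_encoder(encoder_layers):
    """Start the DNN hidden layers from an independent copy of the encoder weights."""
    return copy.deepcopy(encoder_layers)

def toy_decoder(z, label):
    """Hypothetical stand-in for a trained ICVAE decoder."""
    return [label + 0.1 * v for v in z]

rng = random.Random(0)

# Hypothetical imbalanced training set with integer class ids.
labels = [0] * 6 + [1] * 3 + [2] * 1
records = [[float(c), float(c)] for c in labels]

# Generate synthetic records for every under-represented class.
for cls, n in samples_needed(labels).items():
    records += generate_for_class(toy_decoder, cls, n, latent_dim=2, rng=rng)
    labels += [cls] * n

balanced_counts = Counter(labels)

# Toy encoder with one layer, stored as (weights, biases).
encoder_layers = [([[0.1, -0.2], [0.3, 0.4]], [0.0, 0.0])]
dnn_hidden = init_hidden_from_encoder(encoder_layers)
```

Copying the weights rather than sharing them means that fine-tuning the DNN by back propagation does not alter the trained encoder.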
The remainder of this paper is organized as follows. Related work is introduced in Section 2. Section 3 describes the ICVAE and DNN algorithms. Section 4 proposes a novel intrusion detection model and shows in detail how the model works. Section 5 presents the experimental details and results. Finally, Section 6 provides conclusions and directions for further work.
2. Related Works
Although there is CVAE-related work in other fields, there is no report on combining ICVAE and DNN for intrusion detection. Kawachi et al. [39] employed a VAE for supervised anomaly detection. Sun et al. [40] used a VAE to learn sparse representations for anomaly detection. Chandy et al. [41] used a VAE as a deep generative model to model network attack detection problems. Osada et al. [42] employed a VAE as a semi-supervised learning method for intrusion detection. All of these works use a VAE rather than a CVAE to detect intrusions. Lopez-Martin et al. [16] used a conditional VAE (CVAE) to build the ID-CVAE classifier, which performs classification and feature recovery. ID-CVAE classifies test samples using the reconstructed test data and a nearest neighbor method based on the Euclidean distance. In contrast, our proposed model not only generates data according to categories but also uses a DNN classifier to perform classification.
Deep learning integrates high-level feature extraction and classification, overcomes some limitations of shallow learning, and has further advanced intrusion detection systems. Recently, deep learning models have been widely used in the field of intrusion detection. Stacked AutoEncoders have been used to detect attacks in IEEE 802.11 networks with an overall accuracy of 98.60% [43]. Ma et al. [44] presented a hybrid method combining spectral clustering and deep neural networks to detect attacks, with an overall accuracy of 72.64% on the NSL-KDD dataset. The gated recurrent unit recurrent neural network (GRU-RNN) was used to build an intrusion detection system in a software-defined network (SDN) with an accuracy of 89% [45]. Shone et al. [15] employed a stacked non-symmetric AutoEncoder and random forest (RF) to detect attacks. Muna et al. [46] proposed an anomaly detection technique for internet industrial control systems (IICSs) based on deep learning, which used a deep auto-encoder for feature extraction and a deep feedforward neural network for classification. Tamer et al. [20] employed the restricted Boltzmann machine (RBM) to classify normal and abnormal network traffic. Imamverdiyev [18] used a multilayer deep Gaussian-Bernoulli RBM to detect DoS attacks with an accuracy of 73.23% on the NSL-KDD dataset.
The results reported above are encouraging, but these classification techniques still have shortcomings, notably a low detection rate for unknown attacks and a high false positive rate for minority attacks. To overcome these problems, this paper uses the ICVAE decoder to generate new attack samples for specified intrusion categories, thereby improving the detection rate for unknown attacks and minority attacks. The ICVAE encoder automatically learns a latent representation of the input data and reduces the feature dimension. Furthermore, the ICVAE encoder is used to initialize the weights of the DNN hidden layers. Finally, back propagation and fine-tuning of the network parameters make it easier for the DNN to approach a global optimum.
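The detection rate and false positive rate referred to above can be computed per class from one-vs-rest confusion counts. The sketch below uses the common definitions DR = TP / (TP + FN) and FPR = FP / (FP + TN); the labels and predictions are hypothetical and serve only to show the arithmetic.

```python
def one_vs_rest_counts(y_true, y_pred, cls):
    """Count TP, FP, TN, FN when `cls` is treated as the positive class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == cls and p == cls:
            tp += 1
        elif t != cls and p == cls:
            fp += 1
        elif t == cls and p != cls:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def rates(tp, fp, tn, fn):
    """Return (detection rate, false positive rate); 0.0 when a denominator is empty."""
    dr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return dr, fpr

# Hypothetical ground truth and predictions for ten records.
y_true = ["normal"] * 6 + ["dos"] * 3 + ["r2l"] * 1
y_pred = ["normal"] * 5 + ["r2l"] + ["dos"] * 2 + ["normal"] + ["r2l"]

r2l_dr, r2l_fpr = rates(*one_vs_rest_counts(y_true, y_pred, "r2l"))
dos_dr, dos_fpr = rates(*one_vs_rest_counts(y_true, y_pred, "dos"))
```

Here the single R2L record is detected, but one normal record is also labeled R2L, so the R2L false positive rate is 1/9; two of the three DoS records are detected with no false positives.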
6. Conclusions
In this paper, we propose a novel intrusion detection approach called ICVAE-DNN that combines the ICVAE with a DNN. For large datasets, ICVAE can learn and explore the latent sparse representations between network data features and categories. The trained ICVAE encoder is used to initialize the weights of the DNN hidden layers. As a result, the DNN can learn faster and more easily than a traditional multi-layer perceptron, which helps it avoid becoming trapped in local minima. The ICVAE decoder can generate various unknown attack samples according to the specified intrusion categories, which not only balances the training dataset but also increases the diversity of the training samples, so ICVAE-DNN can improve the detection rate for minority attacks and unknown attacks. The DNN automatically extracts high-level abstract features from the training data and can therefore reduce the data dimension and avoid the curse of dimensionality. It integrates feature extraction and classification into a single system that does not require many heuristic rules or manual experience. The classification performance of ICVAE-DNN is evaluated on the NSL-KDD (KDDTest+), NSL-KDD (KDDTest-21), and UNSW-NB15 datasets and compared with six well-known classifiers. The experimental results show that the proposed ICVAE-DNN provides higher detection rates for minority attacks (i.e., U2R, R2L, shellcode, and worms) than the six well-known classification algorithms: KNN, MultinomialNB, RF, SVM, DNN, and DBN. In addition, compared with state-of-the-art classifiers (such as SCDNN, STL, DNN, Gaussian-Bernoulli RBM, RNN-IDS, ID-CVAE, CASCADE-ANN, EM Clustering, and DT), the proposed ICVAE-DNN achieves better accuracy, detection rate, and false positive rate. These results indicate that ICVAE-DNN is well suited to detecting network intrusions, especially minority attacks and unknown attacks.
In future work, we plan to study effective ways to further improve the detection performance for minority attacks and unknown attacks. In particular, we plan to use adversarial learning to explore the spatial distribution of the ICVAE latent variables so that input samples can be reconstructed more accurately. With adversarial learning, similar minority attacks can be synthesized and the diversity of training samples can be increased. As a result, the detection performance of ICVAE-DNN can be further improved.