MJEE - Volume 17 - Issue 2 - Page 69-77

Majlesi Journal of Electrical Engineering Vol. 17, No. 2, June 2023

Deep Learning Based Early Intrusion Detection in IIoT using Honeypot
Abbasgholi Pashaei1, Mohammad Esmaeil Akbari2*, Mina Zolfy Lighvan3 , Asghar Charmin4
1,2,4- Department of Electrical Engineering, Ahar Branch, Islamic Azad University, Ahar, Iran.
Email: a-pashaei@iau-ahar.ac.ir, m-akbari@iau-ahar.ac.ir (Corresponding author), a_charmin@sut.ac.ir.
3- Department of Electrical and Computer Engineering Faculty, Tabriz University, Tabriz, Iran.
Email: mzolfy@tabrizu.ac.ir

Received: 27 September 2022 Revised: 19 October 2022 Accepted: 19 November 2022

ABSTRACT:
The increasing number of Industrial Internet of Things (IIoT) devices presents hackers with a huge attack surface from
which to conduct potentially more destructive attacks. Many of these attacks have succeeded because of the attackers'
inventive and novel approaches. Because network technology and attack attempts are unpredictable, traditional
detection approaches become ineffective, whereas the accuracy of Deep Learning (DL) algorithms has been shown
across a range of scientific fields. The Convolutional Neural Network (CNN) is a well-suited alternative
for anomaly detection and classification since it can automatically classify incoming data and perform calculations
faster. We introduce the Honeypot Early Intrusion Detection System (HEIDS), which detects anomalies and classifies
intrusions in IIoT networks using DL methods. The model is designed to detect adversaries attempting to attack IIoT
Industrial Control Systems (ICS) and is implemented using one-dimensional convolutional neural
networks (CNN 1D). Given the importance of industrial services, this system contributes to improving
intrusion detection in the industrial domain. Finally, this research evaluates the IIoT HEIDS
dataset using the CNN 1D technique; with this approach, a prediction accuracy of 1.0 was reached.

KEYWORDS: Industrial Internet of Things (IIoT), Honeypot Early Intrusion Detection System (HEIDS), IIoT
HEIDS, Network Security, Deep Learning (DL), One-dimensional Convolutional Neural Networks (CNN 1D).

1. INTRODUCTION
An Industrial Internet of Things (IIoT) infrastructure has distinct characteristics that influence the design of industrial facilities. The device, communication channel, protocol, traffic volume, and quality-of-service requirements may differ for each application. IIoT devices use industrial protocols not found in Information and Communications Technology (ICT) or Internet of Things (IoT) environments [1]. These industrial devices have lifetimes of decades, must function under strict time constraints, and are in charge of vital infrastructures. Unlike IoT, IIoT applications often need continuous monitoring and control of physical processes [2].
IIoT is the application of the IoT to the automation of industrial processes. Mission-critical and non-critical applications use these networks of devices to monitor and control physical processes [3]. Devices used in the oil and gas industry, such as Programmable Logic Controllers (PLCs) and Remote Terminal Units (RTUs), are examples of Industrial Control System (ICS) components [4]. IIoT links the physical and logical worlds of engineering and information technology. IIoT sensors collect data from physical components and transmit it to logical elements, while actuators respond to logical elements by modifying physical components. An IIoT ICS is a closed system that places a premium on information interaction for the purpose of sensing and controlling the physical environment [5].
In contrast to a corporate information technology network, IIoT devices are often used in large-scale Industrial Control Networks. To ensure interoperability and manageability, industries prefer to use homogeneous devices and protocols (e.g., Message Queue Telemetry Transport (MQTT), Data Distribution Service (DDS), and Constrained Application Protocol (CoAP)) inside a single network. Additionally, industries handle device life cycles in-house (e.g., firmware upgrades or vulnerability patches). Due to the transparency of Industrial Control Networks and the predictability of IIoT devices, basic security requirements may simply prohibit IIoT.

Paper type: Research paper
DOI:10.30486/mjee.2023.1970288.0
How to cite this paper: A. Gh. Pashaei, M. Esmaei Akbari, M. Zolfy Lighvan and A. Charmin, “Deep Learning Based Early
Intrusion Detection in IIoT using Honeypot”, Majlesi Journal of Electrical Engineering, Vol. 17, No. 2, pp. 69-77, 2023.
The IIoT is transforming the twenty-first century into a smart one. Actuators, sensors, and a variety of other industrial devices are being employed in a growing number of facilities worldwide. While this improves connectivity and efficiency, IIoT devices each have their own resource limits, network constraints, and so on, all of which affect application security [6].
Due to the limited power, storage, compute, and communication capabilities available to IIoT devices, it is challenging to develop security mechanisms for them. Moreover, automation devices were not built with security in mind and were initially thought to be secure owing to their isolation. Attacks such as Stuxnet and Flame exposed the weakness of this security-through-obscurity premise. As more industrial devices connect to the internet, updates and fixes delivered over the internet become vital for decades-old industrial equipment [7].
To safeguard IIoT environments, conventional security methods such as encryption, unified threat management, antivirus, firewalls, and intrusion detection systems should be used. However, they do not let security specialists observe how attackers operate or study their behavior. Honeypots provide actionable information about the attackers, making them a feasible choice at this point. A honeypot is a device that is intended to be attacked and compromised. Honeypots deceive intruders into assuming they have physical access to systems [8]. Honeypots should be used in combination with firewalls and intrusion detection systems to improve system security and prevent further attacks [9].
Running full-scale artificial intelligence or DL workloads on tiny devices is regarded as difficult and complex. These critical security objectives must be considered during the training and evaluation of the DL models of IIoT honeypot devices. These constraints call for self-tuning DL components and for optimizing their hyperparameters in the IIoT network. Owing to these barriers and the lack of practical insight, the development of trustworthy DL techniques for IIoT honeypot networks is still in its infancy. As a result, the HEIDS-based DL model is proposed to evaluate the trustworthiness and reliability of IIoT networks.
While there are a few honeypot projects focused on the IIoT, there is no research in the literature that evaluates the performance of DL algorithms integrated with honeypots to speed up early intrusion detection and improve its accuracy in industrial contexts. This work examines DL algorithms coupled with honeypots, analyzes their similarities and differences, and extracts the elements that are crucial for the design and deployment of IIoT honeypots. To address this critical research need, we present our complete HEIDS model for IIoT contexts. To our knowledge, this is the first research in the scholarly literature that reviews the current state of the art in honeypot-based early intrusion detection systems using DL algorithms for IIoT. To ease IIoT HEIDS development, a new framework is necessary. This work makes the following contributions in this regard:
• DL methodology is applied to IIoT HEIDS in order to provide more innovative analysis and classification for Early Detection (EL) systems.
• A novel CNN-based technique is developed for the EL of recognized attacks in IIoT networks using DL methodologies.
• To identify abnormalities, attacks, and anomalies in the simulated environment of the university laboratory's honeypot-based IIoT network, a novel approach was employed to gather and construct a new dataset called the HEIDS dataset, which contains information about the types of attacks and network traffic.
• The CNN 1D algorithm is used to analyze the logs and recorded data from IIoT honeypot systems in order to establish the most effective and fastest method for detecting network attacks.
• To accomplish EL with HEIDS, multi-class classification using CNN1D convolutional neural networks is used.
Because typical cybersecurity measures for IIoT devices are ineffective, DL-enabled solutions are the new approach to safeguarding IIoT devices. Among the several DL approaches available, this study used the CNN 1D algorithm on the IIoT HEIDS dataset. This dataset has been made publicly available for research and testing purposes by the Azad University of Ahar.
The remainder of this paper is structured as follows: Section 2 discusses the related works. Section 3 defines the proposed HEIDS methodology. Section 4 presents results and discussion. Conclusions are discussed in Section 5.

2. LITERATURE REVIEW
Securing honeypots with DL applications in IIoT devices is a significant challenge. Numerous solutions have been presented in the literature for resolving this issue. In [10], the author used an adaptive honeypot to gather information from attackers. RASSH, an adaptive honeypot based on the medium-interaction Kippo honeypot, was presented by the author in [11]. Furthermore, [12] presented a new SSH honeypot called Q Reinforced Adaptive SSH (QRASSH), which makes use of Cowrie and Deep Q-learning.
The authors of [13] presented IRASSH-T, a self-adaptive IoT honeypot that is based on the QRASSH honeypot and focused on SSH/Telnet. In [14], the authors presented a new kind of honeypot they call intelligent interaction, which mimics the actions of IoT devices without presenting a danger to the honeypot. The authors of [15] focused on the use of the Cowrie honeypot to detect attacks on IoT devices and built a Cowrie variant that included ML capabilities. They found that the Support Vector Machine (SVM) delivers the most accurate results, at 97.39 percent. The authors of [16] focused on data collection by emulating an IoT botnet system with Cowrie SSH/Telnet honeypots. Their solution improves performance by configuring prefabricated command outputs that correspond to real-world IoT devices and by applying sequence matching to connections on ports. Additionally, they used a clustering technique to discover that the most common way of obtaining or producing data is through botnet attacks on Telnet ports and that Mirai is used in a huge number of attacks on IoT devices. The authors of [17] demonstrated how to use a self-adaptive honeypot based on ThingPot for the detection of malware and the identification of unknown malware, such as that employed in DDoS attacks. The proposed approach collects data about ThingPot honeypot attacks and uses it to build machine learning classifiers.
DiPot, a distributed ICS honeypot, was presented by the authors in [18]. They extend the Conpot framework by simulating ICS protocols, collecting data, and performing analysis using K-means clustering. In [19], the authors presented the design and implementation of NeuralPot, a strategy for adapting honeypot technology to the requirements of an industrial network. It is an interactive adaptation of the Conpot honeypot that generates network traffic when another network device is detected. The authors of [20] proposed a novel architecture for developing adaptable honeypots with HARM that makes use of SARSA or Q-Learning. Additionally, they demonstrated adaptation and agility when confronted with a honeypot dataset obtained through an SSH attack method. Numerous analyses have identified honeypot technology as an attack vector for malware. The design and operation of honeypots are based on earlier taxonomies designed to produce large datasets over the duration of longitudinal deployments. This framework may be extended to include honeypots that detect various forms of attacks. In [21], the authors advocated combining Conpot and Dionaea, adjusting the interaction and connection of the honeypots with the ICS network in order to balance detection accuracy and risk, and integrating the honeypot detection feeds with an SDN framework to enable autonomic reconfiguration.
Recent research on attack detection has mostly overlooked the resource-constrained devices of the IIoT. In applications that demand very intricate DL network calculations for recognizing attack events, job allocation and concurrent processing of learning steps are required. Malicious attacks on IIoT systems may result in unmanageable data traffic, power drain on IIoT devices, resource consumption on the network, and data corruption. IIoT devices were not designed with general-purpose honeypots in mind. We provide a real-time technique for the early detection of these risks from honeypot data (e.g., system logs, IP addresses, attack types and characteristics, executed instructions and commands, and behavioral analysis), since such risks must be identified in real time. To our knowledge, we are the first to analyze the existing IIoT HEIDS studies based on a CNN1D model, to suggest an industrial ICS compatible with a honeypot, and to describe the aforementioned innovative architecture. Consequently, it is critical to study the effect of combining a DL CNN1D algorithm with honeypot-based intrusion detection systems. This study examines the performance of an IIoT HEIDS system when combined with a DL system in order to increase its accuracy and computational speed.

3. PROPOSED HEIDS METHODOLOGY
The IIoT HEIDS infrastructure, the topological configuration of the devices, and the model training/testing steps of the DL-based Honeypot Early Intrusion Detection method for IIoT presented in this study are shown in Fig. 1.


[Figure 1: architecture diagram. Recoverable labels: IIoT Honeypot EIDS; IIoT HEIDS; HMI; SCADA; IIoT Conpot; IIoT HoneyPLC; IIoT Honeypot; IIoT Cowrie; Router; Industrial Switch; SNMP; PLC S7-200, PLC S7-300, PLC S7-1200; field devices: Auto Valve, Compressor, Motor, Power Supply, Lab Tem.]
Fig. 1. The overall scheme of the proposed integrated IIoT Honeypot EIDS architecture.

3.1. Packet-Based Detection on EIDS model
Flow-based detection outperforms packet-based detection in experiments. However, flow-based detection has flaws. During data preparation, it must pad short flows or truncate the excess bytes of long flows. Certain flows contain bytes in excess of the truncation threshold, resulting in severe data loss and reduced detection accuracy. Moreover, because many flows contain a large number of packets, the attacker may already have entered successfully before the prediction is made.
The packet-based method uses groups of packets as finer-grained data units. M packets sharing the same tuple (source/destination IP address, source/destination port, and protocol) are used in the data preparation process. The value of M affects both detection accuracy and computational efficiency. An input unit consists of the packets with identical tuples that are received within a waiting period (possibly fewer than M packets). Apart from the IP address and port, the IP data included in the M packets are used as input. To keep the length fixed, each input unit is padded or truncated: if the byte length s of the multi-packet unit is less than or equal to S, it is padded with zeros; otherwise, the last (s - S) bytes are truncated. Padding or truncating ensures that all input units have a length of S bytes. Prior to training, the data are normalized to eliminate the computational inefficiencies associated with using raw traffic data as input. In this study, min-max input scaling is applied so that the input components with the greatest range do not dominate training. The scaler of the normalization function [22], with a minimum-to-maximum range, is defined as follows:

$$ x_n = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (1) $$

where $x_n$ is the scaled data vector, $x$ is the supplied data vector of feature values, and $x_{\min}$ and $x_{\max}$ are its minimum and maximum values.
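The padding/truncation rule and the min-max scaling of Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the preprocessing code used in this study: the function names `fit_length` and `min_max_scale`, the unit length `S = 8`, and the sample byte strings are all hypothetical.

```python
def fit_length(unit: bytes, size: int) -> bytes:
    """Zero-pad a short input unit or truncate a long one to exactly `size` bytes."""
    if len(unit) <= size:
        return unit + bytes(size - len(unit))  # append (size - s) zero bytes
    return unit[:size]                         # drop the last (s - size) bytes


def min_max_scale(values):
    """Scale a sequence of numbers to [0, 1] following Eq. (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant input: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


S = 8  # hypothetical unit length in bytes
short_unit = fit_length(b"\x10\x20\x30", S)    # padded with five zero bytes
long_unit = fit_length(bytes(range(12)), S)    # last four bytes removed
scaled = min_max_scale(list(short_unit))       # byte values mapped to [0, 1]
```

Handling the constant-input case separately is a design choice of this sketch; Eq. (1) itself is undefined when the maximum and minimum coincide.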


3.2. DL CNNs model
The term CNN refers to a subset of Feed-Forward Neural Networks (FFNs). In this work, the CNN operates on one-dimensional time series data with well-defined time intervals. We use CNN algorithms to describe network traffic events as time series data spanning both benign and malicious connections. The CNN1D method typically consists of five layers, which include convolution, pooling, fully connected, and non-linear activation (ReLU) layers. Each filter is a learnt weight vector. The network starts with convolutional and max-pooling layers, continues with sparsely or fully connected layers, and finishes with a decision or classification layer. Through supervised learning, this deep CNN is capable of handling both small and large changes in the input data. A CNN outperforms other neural-network-based feature extraction methods owing to its small number of weights. A deep CNN is capable of recognizing patterns or objects in one-dimensional (1D) input. In comparison to traditional neural networks, CNNs have a distinct architecture. In a traditional neural network, each layer is composed of a group of neurons that are fully linked to the preceding layer. By contrast, each layer of a CNN is only partially linked to the neurons of the preceding layer. A hidden layer is constructed by combining one or more CNN layers with an FFN dense layer. The CNN receives data of shape 30 × 1 from the input layer. The convolution layer generates a tensor of shape 30 × 3 × 64 (the number 64 refers to the number of filters) and passes it to the max-pooling layer, which reduces the tensor to 13 × 3 × 64. Using the FFN dense layer, this tensor may be used to identify objects or to record temporal patterns.
In the convolution expression of Eq. (5), $h_i$ is the feature map at layer i and $h_0 = x$ denotes the input layer, $w_i$ denotes the weight vector of the convolution filter at layer i, and $b_i$ and $f$ signify the bias vector and activation function, respectively. The rectified linear unit (ReLU) activation function is a frequently used non-linear function in CNNs [23]. By sharing the weight and bias vectors, the CNN uses fewer parameters than a standard neural network. Additionally, it does not need hand-crafted feature extraction, as is the case with traditional ML classifiers. The pooling layer downsamples the feature map to reduce its dimensionality.
The convolutional 1D layer receives the network traffic data packets as an input vector $x = (x_1, x_2, x_3, \ldots, x_{k-1}, x_{29}, cl)$, where $x_k$ signifies the features and $cl$ denotes the class label for the dataset. Convolution 1D constructs a feature map $h_i$ by convolving the input data with a filter $w$, where ƒ denotes the features in data packets [24]. From a set of features ƒ, a new feature map is obtained as

$$ kl_i^{hi} = \tanh\left(w^{hi} x_{i:i+f-1} + b\right) \qquad (2) $$

where b denotes a bias term. The filter kl is applied to each set of features ƒ included in a data connection record $\{x_{1:f}, x_{2:f+1}, \ldots, x_{n-f+1}\}$ in order to produce a feature map

$$ kl = \left[kl_1, kl_2, \ldots, kl_{n-f+1}\right] \qquad (3) $$

The max-pooling operation is applied to each feature map as $\hat{kl} = \max\{kl\}$. This returns the most important characteristic, namely the highest-valued feature. Multiple filters yield multiple features, which are then supplied to the fully connected layer. The fully connected layer includes the softmax function, which computes the probability distribution over the classes [25]. The fully connected layer is described mathematically as

$$ o_t = \mathrm{SoftMax}\left(w_{ho}\, kl + b_o\right) \qquad (4) $$

Convolution 1D generates a feature map $h_i$ from the input data by convolution with a filter $w_i$. The convolution process is expressed by Eq. (5):

$$ h_i = f\left(h_{i-1} * w_i + b_i\right) \qquad (5) $$

Following the convolution layer, a pooling layer is added to reduce the size of the feature map. The pooling layer is written as Eq. (6):

$$ fh_i = \mathrm{pool}\left(f\left(fh_{i-1}\right)\right) \qquad (6) $$

4. RESULTS AND DISCUSSION
The data required to verify the CNN 1D model should be easily accessible and accurately reflect the behavior of the host or network. Creating a dataset is time-consuming and difficult; consequently, using a benchmark dataset speeds up the evaluation procedure. Because benchmark datasets are validated, they support the creation and extraction of more convincing experimental findings in laboratory research, as well as the comparison of the proposed method's outcomes with those of earlier research. To identify the most efficient detection model for the honeypot's stored data, HEIDS logs collected in the laboratory were used to verify the study's results and accuracy. The CIC-IDS 2017 dataset, the NSL-KDD dataset, the Kyoto 2006 dataset, and the CICDoS2019 dataset are all used in this study.
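As an illustration of the CNN1D computations in Section 3.2, the following minimal Python sketch evaluates Eqs. (2)-(4) on a toy input: a filter slides over the input with a tanh activation, each feature map is reduced by max-pooling, and a softmax layer turns the pooled features into class probabilities. The input values, filter weights, and output-layer weights are hypothetical and chosen only for illustration; they are not the trained parameters of the model in this study.

```python
import math


def conv1d_tanh(x, w, b):
    """Eq. (2): apply a filter of width f = len(w) to every window of x, then tanh."""
    f = len(w)
    return [
        math.tanh(sum(w[j] * x[i + j] for j in range(f)) + b)
        for i in range(len(x) - f + 1)  # Eq. (3): n - f + 1 entries per feature map
    ]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scaled input record with n = 6 features.
x = [0.0, 0.5, 1.0, 0.5, 0.0, 0.25]

# Two hypothetical filters of width f = 3, each given as (weights, bias).
filters = [([1.0, 0.0, -1.0], 0.0), ([0.5, 0.5, 0.5], 0.1)]

feature_maps = [conv1d_tanh(x, w, b) for w, b in filters]
pooled = [max(fm) for fm in feature_maps]  # max-pooling: keep the highest-valued feature

# Eq. (4): fully connected layer with hypothetical weights for two classes.
w_ho = [[1.0, -1.0], [-1.0, 1.0]]
b_o = [0.0, 0.0]
scores = [sum(wk * pk for wk, pk in zip(row, pooled)) + bias for row, bias in zip(w_ho, b_o)]
probs = softmax(scores)  # probability distribution over the classes
```

A practical model would stack several such layers and learn the weights by supervised training; the sketch only traces a single forward pass.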


This is a mixed data collection comprising a large volume of network traffic and system logs, of which the daily data is a portion. The data collection contains a variety of attack types and subtypes, including brute-force attacks, distributed denial of service attacks, network surveillance attacks, and penetration attacks. However, there are only a few approaches to DL honeypot intrusion detection for this dataset.

4.1. HEIDS Dataset
The HEIDS collection is populated by the intrusion detection system as industrial network logs are produced. The proposed HEIDS dataset is a small but useful resource that allows the system to rapidly detect suspicious network traffic. As a consequence, practically every threat that traverses the network may be detected via the application of adaptable and robust rules. To achieve the aforementioned objectives, a solution is needed for processing the alert data from this dataset. The CSV format is used to analyze alert data since it is the most adaptable and appropriate format for data collection. Table 1 lists the 30 features of the CSV log file, named alert.csv, that is produced as the default output of the logging configuration.

Table 1. Feature generation for the IIoT HEIDS dataset.
Feature       Feature      Feature
time          icmpseq      icmpid
icmpcode      date         sig_generator
icmptype      iplen        dgmlen
id            tos          ttl
tcpwindow     tcpln        tcpack
tcpseq        tcpflags     ethlen
ethdst        ethsrc       dstport
dst           srcport      src
proto         msg          sig_rev
sig_id        timestamp    log

HEIDS is used to detect ICS attacks. Thus, experiments employing DL techniques on large-scale datasets have been conducted. The CIC-IDS 2017 dataset, the NSL-KDD dataset, the Kyoto 2006 dataset, and the CICDoS2019 dataset are all well-known datasets. These datasets, however, do not capture current cyberattack trends. As a result, the HEIDS dataset in this study is enriched with the most recent logs. Additionally, the new dataset is analyzed using DL methods and compared to previous classification findings.

4.2. Metrics
Criteria for evaluating the DL model in the proposed design, namely precision (P), accuracy, recall (R), and F1-score, are required in order to select the most appropriate model for assessing data attributes. These criteria are briefly given below, where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.

$$
\begin{aligned}
\mathrm{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} \\
R &= \frac{TP}{TP + FN} \\
P &= \frac{TP}{TP + FP} \\
F1 &= 2 \times \frac{P \times R}{P + R}
\end{aligned}
\qquad (7)
$$

4.3. Experimental Comparison
To begin, the performance of the model for DL detection is examined. Then, using CNN1D as the detection model, the four metrics discussed in this research are compared in terms of detection accuracy and computing efficiency. The hardware environment of the experiment comprises an Intel(R) Core(TM) i5-2450M CPU running at 2.50 GHz and 8 GB of RAM. The proposed HEIDS dataset is compared, using the well-performing CNN1D model, against the common network datasets CIC-IDS2017, NSL-KDD, Kyoto 2006, and CICDoS2019. The detection precision (P), accuracy, recall (R), and F1-score on the four datasets are compared and examined via tests. On these datasets, CNN1D uses the same network topology and parameters as the CNN1D model applied to the HEIDS dataset in this study. All incoming and outgoing traffic that interacts with the IIoT honeypot sensors is logged in MySQL, the system's database management system. The HEIDS dataset was built to verify the correctness of the operation and its capacity to identify intrusions. To ensure that the results are accurate, the assessment findings for this dataset must be compared to those from conventional evaluation techniques.
The proposed IIoT HEIDS is modeled in such a way that it encompasses all available ICSs, including radio frequency systems, PLCs, pressure sensors, position valves, and different actuators, such as control valves and electric motors. In a real-world industrial setting, the proposed system must identify attacks in real time. As a result, design and simulation work had to be performed concurrently for multiple distinct regions. The incoming and outgoing traffic logs for these systems were therefore gathered in existing facilities, which are realistically distributed over many networks using different protocols, and stored in a comprehensive dataset known as the HEIDS system dataset.
The HEIDS dataset must be properly labeled, to the extent feasible, by comparing simulation findings from the other datasets described in this area to those from this research. The IIoT HEIDS research makes available complete data labeling, correctness, and scientific work procedures. Numerous methods for obtaining accurate data have been examined in this study.
The EL performance of the proposed IIoT HEIDS system, using the logs stored in the dataset, was compared to the performance obtained on other large, widely used datasets. Their results were collected, processed, and evaluated, and the findings from the DL algorithm for IIoT HEIDS suggest that the technique is successful at early detection. Additionally, the same procedure was applied to the HEIDS dataset as to the other datasets.

4.4. Performance Evaluation of DL Detection
The five datasets were evaluated and processed independently. The findings of the individual analyses, based on the metrics above, are collected and presented in this part in the form of tables and diagrams with explanations. These analyses, procedures, graphs, and findings were produced with a specialized application created in the Python programming language for this project.
Table 2 contains information on the datasets and summarizes the results computed by the developed software for identifying anomalous traffic using the CNN1D algorithm. Fig. 2 illustrates the results in Table 2 as a bar chart. According to the accuracy criterion, the HEIDS dataset outperforms the other datasets.

Table 2. Results for anomalous traffic detection for the CNN1D algorithm used in the research for the five datasets.
Dataset       Accuracy   Precision   Recall   F1
NSL-KDD       0.770      0.652       0.922    0.764
CICIDS2017    0.995      0.863       0.808    0.835
Kyoto2006+    0.9794     0.9514      0.9992   0.9747
CICDoS2019    0.9999     0.9975      0.9809   0.9892
HEIDS         1.000      1.000       1.000    1.000

Another criterion is the F1-score, which combines the R and P criteria; here, as with the R and P criteria, the HEIDS dataset performs better, as seen in Fig. 2. On the HEIDS dataset, the CNN1D method achieves high accuracy and F1-score. Finally, Table 2 summarizes the results obtained using the model proposed in this study for the HEIDS dataset. The output of the CNN1D algorithm is shown in Fig. 2. The accuracy values indicate that the CNN1D algorithm performs better on the HEIDS dataset than on the other datasets, as demonstrated by the results of the study.

[Figure 2: bar chart titled "CNN1D"; vertical axis from 0.000 to 1.200; series: Accuracy, Recall, Precision, F1.]
Fig. 2. Accuracy, R, P, and F1-score for detecting traffic anomalies with the CNN1D algorithm on the NSL-KDD, CIC-IDS2017, Kyoto 2006, CICDoS2019, and HEIDS datasets, computed with the Python software.

Additionally, Tables 3 and 4 give the improvement rate of the results obtained for detecting traffic anomalies using the proposed HEIDS dataset, expressed in percentage terms for the two essential criteria, accuracy and F1-score, respectively, when compared to the four datasets mentioned in this research. This improvement is important for HEIDS since the approach developed for this research is capable of detecting the intrusion of abnormal traffic with high accuracy. As a result, this design is very reliable for use in industrial facilities.
A confusion matrix is a table that is often used to describe the performance of a classification model on a set of test data whose true values are known. A Receiver Operating Characteristic (ROC) curve is a graphical depiction of the diagnostic performance of a binary classifier system as its discrimination threshold is varied. Fig. 3 shows the receiver operating characteristic curves in panels (a, c, e, g, j) and the confusion matrices in panels (b, d, f, h, k) obtained when detecting traffic anomalies with the Python simulation tool for the NSL-KDD, CIC-IDS2017, Kyoto 2006, CICDoS2019, and IIoT HEIDS datasets.
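The metrics of Eq. (7) are all derived from the four counts of a binary confusion matrix, which can be sketched as follows. This is a minimal illustration, not the evaluation software used in this study: the function names and the toy label vectors are hypothetical, while the precision and recall used in the final check are the NSL-KDD values reported in Table 2.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn


def f1_score(precision, recall):
    """F1 as the harmonic mean of precision and recall, as in Eq. (7)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts, Eq. (7)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall, f1_score(precision, recall)


# Hypothetical labels: 1 = attack, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
accuracy, precision, recall, f1 = metrics(*counts)

# Consistency check against Table 2: NSL-KDD precision 0.652 and recall 0.922.
nsl_kdd_f1 = round(f1_score(0.652, 0.922), 3)
```

The consistency check reproduces the F1 value listed for NSL-KDD in Table 2 from its reported precision and recall.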


Table 3. Improvement in accuracy (percentage) for detecting traffic anomalies obtained with the proposed HEIDS dataset, compared to the other four datasets mentioned in this research.
Method \ Dataset   NSL-KDD   CIC-IDS2017   Kyoto2006   CICDoS2019
CNN1D              29.80%    0.46%         2.10%       0.07%

Table 4. Improvement rate of the obtained results for the F1-score, compared to the other four datasets mentioned in this research.
Method \ Dataset   NSL-KDD   CIC-IDS2017   Kyoto2006   CICDoS2019
CNN1D              30.92%    19.82%        2.59%       1.07%

[Figure 3 panels, top row: (a) NSL-KDD, (c) CIC-IDS2017, (e) Kyoto 2006, (g) CICDoS2019, (j) IIoT-HEIDS; bottom row: (b) NSL-KDD, (d) CIC-IDS2017, (f) Kyoto 2006, (h) CIC-DoS2019, (k) IIoT-HEIDS.]
Fig. 3. Confusion matrices and receiver operating characteristic curves obtained when identifying traffic anomalies with the Python simulation tool on the NSL-KDD, CIC-IDS2017, Kyoto 2006, CICDoS2019, and IIoT HEIDS datasets.

5. CONCLUSION

This paper investigated the effect of combining the deep learning algorithm CNN1D with the IIoT HEIDS sensors in industrial devices and environments. To build the HEIDS dataset, port scanner attacks, DDoS attacks, and other attacks were simulated on operating IIoT HEIDS sensors, as well as on other critical tools and equipment. The system offers a solution based on a honeypot combined with a deep learning algorithm for modeling and forecasting, which enables the detection and categorization of data as normal or abnormal (suspicious).

Several datasets, including NSL-KDD, CIC-IDS2017, Kyoto 2006, and CIC-DoS2019, were used to develop a comprehensive strategy for industrial network categorization, and a dataset was developed using the best characteristics. The accuracy index was then tested with DL-CNN1D on the four reference datasets and on the dataset of the proposed HEIDS approach in a fully equipped IIoT HEIDS laboratory. On the CNN1D test data, the accuracy of HEIDS on the primary dataset was higher than on each of the four reference datasets: by 29.80 percent compared to NSL-KDD, by 0.46 percent compared to CIC-IDS2017, by 2.10 percent compared to Kyoto 2006, and by 1.07 percent compared to CIC-DoS2019.

According to the collected findings, the tool built for this study improved the analysis of the IIoT HEIDS dataset for early intrusion detection when compared to the other datasets. The completed design is capable of detecting abnormal traffic in industrial facilities with high accuracy through its expanded sensor network. As a result, it is an efficient and comprehensive cybersecurity system capable of defending against future attacks and zero-day exploits in industrial facilities.
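To make the idea of classifying traffic as normal or abnormal with a one-dimensional convolution more concrete, the sketch below runs a single forward pass through one convolution filter, a ReLU, global max pooling, and a sigmoid output. It is only an illustrative toy under stated assumptions: the layer layout, the function names, and all weights are hypothetical and do not reproduce the CNN1D architecture or trained parameters used in this paper.

```python
import math
from typing import List


def conv1d_valid(x: List[float], kernel: List[float], bias: float = 0.0) -> List[float]:
    """Slide the kernel over x with stride 1 and no padding
    (cross-correlation, as 'convolution' is usually meant in deep learning)."""
    k = len(kernel)
    if k == 0 or k > len(x):
        raise ValueError("kernel must be non-empty and no longer than the input")
    return [sum(x[i + j] * kernel[j] for j in range(k)) + bias
            for i in range(len(x) - k + 1)]


def relu(values: List[float]) -> List[float]:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def predict_abnormal_probability(features: List[float], kernel: List[float],
                                 conv_bias: float, out_weight: float,
                                 out_bias: float) -> float:
    """Forward pass: conv1d -> ReLU -> global max pooling -> sigmoid."""
    feature_map = relu(conv1d_valid(features, kernel, conv_bias))
    pooled = max(feature_map)  # global max pooling over the feature map
    return 1.0 / (1.0 + math.exp(-(out_weight * pooled + out_bias)))


# Hypothetical feature vector for one flow and hand-picked (untrained) weights.
flow_features = [0.0, 1.0, 0.0, 2.0]
probability = predict_abnormal_probability(
    flow_features, kernel=[1.0, 1.0], conv_bias=0.0, out_weight=1.0, out_bias=-2.0
)
print(probability)
```

In practice the kernel and output weights would be learned from labeled traffic, and several filters and layers would be stacked; the single-filter version only shows how a sliding window over the feature sequence yields one score per flow.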


[4] A. Pashaei, M. E. Akbari, M. Z. Lighvan, and [15] Y. Zhou, "Chameleon: Towards adaptive honeypot
Charmin, A, "Early Intrusion Detection System for internet of things." in Proceedings of the ACM
using honeypot for industrial control networks," Turing Celebration Conference-China, pp. 1-5, 2019.
Results in Engineering, 16, 100576, 2022. [16] B. Lingenfelter, I. Vakilinia, and S. Sengupta,
[5] W. Tian, M. Du, X. Ji, G. Liu, Y. Dai, and Z. Han, "Analyzing variation among IoT botnets using
“Honeypot detection strategy against advanced medium interaction honeypots," in 2020 10th
persistent threats in industrial internet of things: a Annual Computing and Communication Workshop
prospect theoretic game,” IEEE Internet of Things and Conference (CCWC): IEEE, pp. 0761-0767, 2020.
Journal, Vol. 8, No. 24, pp. 17372-17381, 2021. [17] M. Wang, J. Santillan, and F. Kuipers, “Thingpot: an
[6] O. Tsemogne, Y. Hayel, C. Kamhoua, and G. interactive internet-of-things honeypot,” arXiv
Deugoué, “Game-Theoretic Modeling of Cyber preprint arXiv:1807.04114, 2018.
Deception Against Epidemic Botnets in Internet of [18] B. Lingenfelter, I. Vakilinia, and S. Sengupta,
Things,” IEEE Internet of Things Journal, Vol. 9, No. "Analyzing variation among IoT botnets using
4, pp. 2678-2687, 2021. medium interaction honeypots." in Proceedings of
[7] Q. Li, X. Feng, H. Wang, and L. Sun, the ACM Turing Celebration Conference-China, pp.
“Understanding the usage of industrial control 0761-0767, 2019.
system devices on the internet,” IEEE Internet of [19] I. Siniosoglou, G. Efstathopoulos, D. Pliatsios, I. D.
Things Journal, Vol. 5, No. 3, pp. 2178-2189, 2018. Moscholios, A. Sarigiannidis, G. Sakellari, G. Loukas,
[8] A. Pashaei, M. E. Akbari, M. Z. Lighvan, and A. and P. Sarigiannidis, "NeuralPot: An industrial
Charmin, “Honeypot Intrusion Detection System honeypot implementation based on deep neural
using an Adversarial Reinforcement Learning for networks." in 2020 IEEE Symposium on Computers
Industrial Control Networks,” Majlesi Journal of and Communications (ISCC): IEEE, pp. 1-7, 2020.
Telecommunication Devices, Vol.12(1), pp. 17-28, [20] S. Dowling, M. Schukat, and E. Barrett, “New
2023. framework for adaptive and agile honeypots,” ETRI
[9] A. Pashaei, M. E. Akbari, M. Z. Lighvan, and A. Journal, Vol. 42, No. 6, pp. 965-975, 2020.
Charmin, “A Honeypot-assisted Industrial Control [21] S. Maesschalck, V. Giotsas, B. Green, and N. Race,
System to Detect Replication Attacks on Wireless "Honeypots for Automatic Network-Level
Sensor Networks,” Majlesi Journal of Industrial Control System Security." in 14th
Telecommunication Devices, Vol. 11(3), pp. 155-160, EuroSys Doctoral Workshop, 2020.
2022. [22] Y. Wang, Y. Jiang, and J. Lan, “Fcnn: An efficient
[10] G. Wagener, “Self-adaptive honeypots coercing and intrusion detection method based on raw network
assessing attacker behaviour,” Institut National traffic,” Security and Communication Networks, Vol.
Polytechnique de Lorraine-INPL, 2011. 2021, 2021.
[11] A. Pauna, and I. Bica, "RASSH-Reinforced adaptive [23] R. Vinayakumar, K. Soman, and P. Poornachandran,
SSH honeypot." In 2014 10th International "Applying convolutional neural network for
Conference on Communications (COMM), IEEE. Vol., network intrusion detection." in 2017 International
No. Issue, pp. 1-6, 2014. Conference on Advances in Computing,
[12] A. Pauna, A.-C. Iacob, and I. Bica, "Qrassh-a self- Communications and Informatics (ICACCI): IEEE,
adaptive ssh honeypot driven by q-learning." In pp. 1222-1228, 2017.
2018 international conference on communications [24] G. Swapna, K. Soman, and R. Vinayakumar,
(COMM), IEEE, pp. 441-446, 2018. “Automated detection of cardiac arrhythmia using
[13] A. Pauna, I. Bica, F. Pop, and A. Castiglione, “On the deep learning techniques,” Procedia computer
rewards of self-adaptive IoT honeypots,” Annals of science, Vol. 132, pp. 1192-1201, 2018.
Telecommunications, Vol. 74, No. 7, pp. 501-515, [25] A. K. Verma, P. Kaushik, and G. Shrivastava, "A
2019. network intrusion detection approach using variant
[14] T. Luo, Z. Xu, X. Jin, Y. Jia, and X. Ouyang, of convolution neural network." in 2019
“Iotcandyjar: Towards an intelligent-interaction International Conference on Communication and
honeypot for iot devices,” Black Hat, Vol. 1, pp. 1- Electronics Systems (ICCES): IEEE, pp. 409-416,
11, 2017. 2019.
