1 Introduction
A network intrusion detection system (NIDS) is a primary tool for cybersecurity operations centers (CSOCs) to detect cyber-attacks on computer networks. With the availability of high-performance computing resources and advancements in artificial intelligence (AI) and machine learning (ML) algorithms, intrusion detection mechanisms have greatly improved, serving the security needs of organizations. However, adversaries are also continuously advancing their toolchains, using AI/ML-enabled methodologies to camouflage their attacks and evade these ML-based NIDS. Hence, CSOCs must improve their security posture by proactively preparing for evasion attacks and making their NIDS robust against evolving adversaries.
Evasion attacks on NIDS are mainly conducted by perturbing network flow-based features to deceive ML models. Table
1 shows a summary of recent studies that focused on adversarial sample generation to evade NIDS. However, flow-based attacks are impractical, as reverse engineering flow-level perturbations into actual packets is complex and difficult [
41]. In addition, hidden correlations among different flow-based features further exacerbate the computational difficulty of replaying perturbations in a real network communication [
15]. More importantly, perturbations must be made such that the communication’s functionality is maintained. Hence, crafting adversarial attacks at the packet level is necessary to improve the practicality of implementing evasion attacks.
A few studies in recent literature have focused on using packet-based data to construct evasion attacks [
8,
15,
47]. These studies have utilized publicly available data sets to obtain the samples for obfuscation and relied on making random perturbations using trial-and-error and other approximation techniques. The generated adversarial samples were then tested against linear, tree-based, and nonlinear ML models for evasion. However, a significant drawback in most of these studies is that, even though they consider or collect packet-level data, the perturbations they make are applied to extracted or aggregated packet information, rather than the raw packets themselves. To the best of our knowledge, Chale et al. [
8] is the only work in the literature that perturbs the raw packets themselves, bypassing feature extraction altogether. The limitations of these studies can be summarized as follows. The perturbations were focused mainly on time-based features, to which a classifier can be made immune by training it on raw packet information. Some studies also generate adversarial samples through packet or payload injection, which damages the packets. Moreover, packet-level features are correlated with one another, so perturbing one feature directly impacts others in the classifier's feature set; none of these studies accounts for this phenomenon, known as the side effect of packet mutation [
37]. Another limitation of existing packet-based approaches is that they perturb both forward and backward packets (i.e., communication from the host to the destination and then the destination back to the host). Clearly, an adversary can only control the forward packets, those originating from the host and going to the destination (server).
Our proposed methodological framework addresses the above limitations in the following ways. Our methodology uses a learning-based approach, in which an AI agent is trained to make (near-)optimal perturbations to any given malicious packet. The agent learns to make these perturbations in a sequential manner using a
deep reinforcement learning (
DRL) approach. We identify the forward packets in network communication and only modify them to produce adversarial samples. We evaluate our adversarial samples against classifiers trained using packet-level data. We aim to make minimal and valid perturbations to the original packets that preserve the functionality of the communication. Examples of such perturbations include modifications to the valid portions in the
internet protocol (
IP) header,
transmission control protocol (
TCP) header, TCP options, and segment data. Furthermore, we only consider perturbing those features that can be obtained from the raw packet capture (PCAP) files without feature engineering. This makes it practical to replicate the attack using perturbed packets. We consider the side effects of packet mutation in this study. For example, any change to the IP or TCP header affects the IP and TCP checksum, respectively. We formalize the problem of side effects of packet mutation in Section
3.4 and provide a detailed description of the perturbations and their side effects in Section
4.3. We also evaluate whether the learning attained from one environment is transferable to another. We do this to gauge the effectiveness of our methodology in real-world settings where adversaries may not have any knowledge of the ML models and the data used to build the NIDS. We demonstrate the playability of the adversarial packet in a flow using the Wireshark application in the results Section
5 of the article. In summary, our article addresses the literature gap in constructing adversarial samples by developing a learning-based methodology with the following characteristics: only the forward packets are perturbed; valid perturbations are considered in order to maintain the functionality of the packets; side effects of perturbations are taken into account; the effectiveness of the adversarial agents is tested against unseen classifiers; and the transferability of the framework to other network environments is demonstrated.
This research study makes several contributions. The primary contribution is the development of a DRL-enabled methodology capable of generating adversarial network packets for evasion attacks on ML-based NIDS. Our methodological framework, Deep PackGen, takes raw network packets as inputs and generates adversarial samples camouflaged as benign packets. The DRL agent in this framework learns the (near-)optimal policy of perturbations that can be applied to a given malicious network packet, constrained by maintaining its functionality while evading the classifier. To the best of our knowledge, this is the first research study that poses the constrained network packet perturbation problem as a sequential decision-making problem and solves it using a DRL approach. Another novel aspect of this research is its packet-based approach to developing classification models for ML-based NIDS: the unidirectional (forward) packets from raw PCAP files are extracted and processed for machine computation, and the transformed network packets are then used to train the classifiers. Other contributions highly relevant to the cybersecurity research community include the insights obtained from the experiments and their analyses. Our investigation reveals that our methodology can generate out-of-distribution (OOD) packets that evade the decision boundaries of even complex nonlinear classifiers. Furthermore, we explain why packets of certain attack types can be manipulated more easily than others. The knowledge gained from this study on the adversary's ability to make specific perturbations to different types of malicious packets can be used by CSOCs to defend against evolving adversarial attacks.
The rest of the article is organized as follows. In Section
2, we present related literature pertaining to different types of intrusion detection mechanisms and adversarial attacks on ML models. We also present an overview of DRL approaches used in security and other application domains. Section
3 describes the DRL-enabled Deep PackGen framework for adversarial network packet generation. The data set creation process, packet classification model development, and the DRL solution approach are explained in this section. Section
4 discusses the numerical experiments conducted in this study. The performance of our methodological framework on publicly available data sets, the analysis of DRL agent’s policies, and the statistical analysis of the adversarial samples are presented in Section
5. Section
6 presents the insights obtained from this research study, along with the conclusions and future work.
4 Numerical Experiments
In this section, we outline the numerical experiments performed to evaluate our Deep PackGen methodology. We first discuss the experimental data, followed by the creation of ML-based packet classification models. Finally, we delve into the hyperparameters employed during the training and testing of the DRL agent. The goal is to train multiple DRL agents, each specifically designed to generate packets for a unique attack type, by interacting with a surrogate model specialized in identifying that type of attack. To achieve this, we generated several data sets for training and testing of the DRL agents.
4.1 Data Description
We conducted numerical experiments using raw PCAP files from two popular network intrusion detection data sets: CIC-IDS2017 [
45] and CIC-IDS2018 [
46]. These data sets contain both benign and attack communications, and provide a pragmatic representation of modern network traffic compared to older data sets like NSL-KDD and KDD-CUP [
19]. Additionally, the availability of raw PCAP files for the CICIDS data sets reduces dependency on extracted flow-level features [
42]. CIC-IDS2017 consists of PCAP files for five consecutive days (Monday to Friday), each with different attack types and sizes, as shown in Table
2. We processed these files to generate the data set, as explained in Section
3. Table
3 displays the number of forward packets extracted for each attack and its subtypes.
To ensure balanced data sets for our experiments, we followed these steps: First, we identified and removed attack types and subtypes with low flow instance counts to facilitate efficient training of packet classification models and DRL agents. Specifically, we excluded the Heartbleed and Botnet attack types, as well as the SQL Injection subtype from the Web Attack category. Next, we downsampled the flow instances of Denial of Service (DoS), Distributed Denial of Service (DDoS), and Port Scan attacks to achieve a near-equal representation of packets across all attack types. Due to substantial differences in both the structural characteristics and execution methodologies of various attacks, we trained individual DRL agents to specialize in distinct attack types. Each DRL agent underwent training using a surrogate packet classification model, developed with an equal number of benign and attack packets for the respective attack type. In constructing the surrogate classifiers, we employed a random selection process to choose attack samples from each attack type, along with an equal number of samples from benign communications. As discussed earlier, we only considered forward packets in our data set, as an adversary controls the generation and manipulation of packets that originate from its source. We extracted the payload bytes from each packet and represented each byte as a feature in this data set. We converted each byte value from hexadecimal to decimal and normalized each feature to the range 0–1, with the raw byte values ranging from 0 to 255. In total, there were 1,525 features.
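For illustration, the following Python sketch captures this byte-level feature extraction; the use of the scapy library, the zero-padding of short packets, and the file name are illustrative assumptions rather than the study's exact implementation.

```python
import numpy as np
from scapy.all import rdpcap, IP, TCP

NUM_FEATURES = 1525  # fixed-length feature vector, as described above

def packet_to_features(pkt):
    """Normalize a packet's raw bytes into a feature vector in [0, 1]."""
    if IP not in pkt or TCP not in pkt:
        return None                               # keep TCP/IP packets only
    raw_bytes = bytes(pkt[IP])                    # IP header, TCP header, payload
    vals = list(raw_bytes)[:NUM_FEATURES]         # truncate oversized packets
    vals += [0] * (NUM_FEATURES - len(vals))      # zero-pad short packets (assumed)
    return np.array(vals, dtype=np.float32) / 255.0

packets = rdpcap("forward_packets.pcap")          # hypothetical input file
X = [v for v in (packet_to_features(p) for p in packets) if v is not None]
```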
We utilized the CIC-IDS2017 data set for training and testing our framework as follows. We divided the data into different attack types, including samples from the benign category. For each attack type, we split the data into three parts: 60% for training the DRL agent and building the surrogate model in the training phase, 30% for building other packet classification models to test the trained DRL agent, and 10% for generating adversarial samples and performing evaluation in the testing phase. In the rest of the article, we will refer to them as training, ML model testing 1, and DRL agent testing 1 data sets, respectively, for each attack type.
Further, to evaluate the performance of the trained agents on different network traffic data, we utilized the CIC-IDS2018 data set. Table
4 shows the different attack types and sizes of the PCAP files in the CIC-IDS2018 data set. We extracted packets for the various attack types, including DoS, Web Attack, Infiltration, Port Scan, and DDoS from these files. Table
3 displays the number of forward packets for them. We followed the same process of balancing the instances of attacks across all attack types as was done with the CIC-IDS2017 data set. We split the data for each attack type into two parts: 70% of the data was allocated for building the ML models for the detection of the respective attack type in the testing phase, and the remaining 30% was utilized to measure the attack success rate (ASR) of the respective trained DRL agent. We refer to these two parts as
ML model testing 2 and
DRL agent testing 2 data sets, respectively. Figure
5 shows a schematic of the above-mentioned data splitting strategy from the two CICIDS data sources and their respective uses in the training and testing phases of the framework. Note that the DRL agents were not trained with the CIC-IDS2018 data samples.
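A minimal sketch of this splitting strategy for the CIC-IDS2017 data is shown below; the stratified sampling and fixed random seed are illustrative choices rather than the exact procedure used in the experiments.

```python
from sklearn.model_selection import train_test_split

def split_cic_ids2017(X, y, seed=42):
    # 60% for DRL agent training and surrogate models; 40% held out
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, train_size=0.60, stratify=y, random_state=seed)
    # Remaining 40% -> 30% (ML model testing 1) + 10% (DRL agent testing 1)
    X_ml1, X_drl1, y_ml1, y_drl1 = train_test_split(
        X_rest, y_rest, train_size=0.75, stratify=y_rest, random_state=seed)
    return (X_train, y_train), (X_ml1, y_ml1), (X_drl1, y_drl1)
```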
4.2 Packet Classification Model Creation
We developed different sets of ML models for both training and testing the DRL agents. Data sets were carefully prepared for developing the surrogate (training phase) and the testing (testing phase) models. ML models in the training phase were developed using the
training data. In contrast, those used for testing the trained DRL agents were developed using either the
ML model testing 1 or
ML model testing 2 data sets (see Figure
5). A DRL agent was trained to perturb packets for each attack type. To effectively train the agent to deceive the classifier’s decision boundary, we used a surrogate model specializing in that respective attack type in the agent’s training environment. For example, if the DRL agent was being trained on perturbing the packets of a
Port Scan attack, then the surrogate model was trained with the forward packets extracted from the network flow data of the same attack.
In the training phase, we selected an ensemble of ML models to act as a surrogate model. We randomly sampled 80% of the
training data to train various ML models, including linear, tree-based, and nonlinear classification models. The performances of all these models on the remaining 20% of the
training data were comparable across each attack type, with most of them achieving an accuracy of around 99%. We selected one model from each of the three types of classifiers for the ensemble:
logistic regression (
LR), decision tree (DT), and
multi-layer perceptron (
MLP). Table
5 shows the accuracy scores of these models that form the ensemble in the training environment.
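As a minimal sketch, the surrogate ensemble can be assembled as follows; the scikit-learn hyperparameters shown are defaults and illustrative, not the tuned values used in the study, and the benign label is assumed to be 0.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

def train_surrogate_ensemble(X_train, y_train):
    models = {
        "LR": LogisticRegression(max_iter=1000),
        "DT": DecisionTreeClassifier(),
        "MLP": MLPClassifier(hidden_layer_sizes=(128, 64)),
    }
    for model in models.values():
        model.fit(X_train, y_train)
    return models

def num_evaded(models, x):
    """Number of ensemble classifiers that label sample x as benign (0)."""
    return sum(int(m.predict(x.reshape(1, -1))[0] == 0) for m in models.values())
```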
Similarly, ML models were developed for the testing phase. Two sets of models were trained: one using the
ML model testing 1 data set and another using the
ML model testing 2 data set. Note that both these data sets contain previously unseen samples by the DRL agents. In addition, the latter contains samples from a different network than that used to train the agents. We used a similar split of 80% on these data sets to train the respective sets of models. The hyperparameters used for training the various models, including
random forest (
RF), deep neural network (DNN), and
support vector machine (
SVM), among others, are shown in Table
6. The values of some of the hyperparameters were adopted from literature [
12,
21], and others were experimentally determined. The testing accuracy scores of a sample list of models for each attack type on the remaining 20% of the respective data sets are shown in Table
7. All these models performed very well in accurately differentiating between benign and malicious samples.
4.3 Adversarial DRL Agent Training
The state space of the DRL agent consists of the 1,525 features extracted from the network packet and its classification label. As discussed in the data set creation component (Section
3.1) of the framework, these features are the normalized values of the bytes pertaining to different TCP/IP header and segment information (see Figure
2). Our focus in this study is to find the (near-)optimal set of perturbations that can be applied to a given malicious network packet to generate a successful adversarial sample, while maintaining the functionality of communication. To show the effectiveness of our methodology, we selected a set of valid perturbations (
\(\Delta\)) based on domain knowledge and insights from literature studies [
1,
3,
14,
15,
23,
32,
34,
43,
50,
59]. Below is a list of the perturbation categories we selected as part of the agents' action space, along with their descriptions and side effects; a code sketch illustrating one such perturbation and its side-effect handling follows the list.
—
Modifying the fragmentation bytes from do not fragment to do fragment.
—
Description: This perturbation can be applied to packets in which fragmentation is turned off. The do fragment command takes the hexadecimal value 00. Turning this flag on breaks the packet into multiple parts; however, no payload information from the packet is deleted or modified.
—
Side effects: This perturbation directly affects byte numbers 7 and 8, and indirectly affects byte numbers 9, 11, and 12 of the IP header, where byte 9 represents the
time to live (
TTL) value, and bytes 11 and 12 represent the IP checksum value. The IP checksum is calculated by summing all the fields present in the IP header, skipping only the checksum bytes [
40]. The checksum value changes with the change in the byte value of the fragmentation bytes. The TTL value is also adjusted as a result of fragmentation [
38].
—
Modifying the fragmentation bytes from do not fragment to more fragment.
—
Description: This perturbation can be applied to packets where fragmentation is turned on or off. The more fragment command takes the hexadecimal value 20. Turning this flag on signifies there are more fragments incoming from the same flow. This also does not alter any of the packet information to be transmitted.
—
Side effects: This perturbation directly affects byte numbers 7 and 8, and indirectly affects byte numbers 9, 11, and 12 of the IP header.
—
Increasing or decreasing the TTL byte value.
—
Description: Any valid perturbation to this byte must result in a final TTL value between 1 and 255. If the TTL value is decreased significantly, the packet might expire before reaching its destination, causing loss of functionality. Hence, we change this value by a small amount (+/-1) to ensure the TTL does not change substantially.
—
Side effects: This perturbation directly affects byte number 9, and indirectly affects bytes 11 and 12 of the IP header.
—
Increasing or decreasing the window size bytes.
—
Description: Any valid perturbation to these bytes must result in a final window size value between 1 and 65,535. TCP is a connection-oriented protocol that tracks transmitted data: the sender transmits segments and retransmits them if the receiver does not acknowledge them within the expected timeframe. TCP incorporates a mechanism known as windowing, in which the receiver's acknowledgment tells the sender how much data can be transmitted before another acknowledgment is expected. This parameter, the window size, indicates the capacity of the receive buffer that hosts use to temporarily store data until the application can process it. We change this value by a small amount so that it does not alter the functionality or maliciousness of the packet.
—
Side effects: This perturbation directly impacts byte numbers 15 and 16 of the TCP header, and indirectly impacts bytes 17 and 18 of the TCP header, which represent the TCP checksum, calculated similarly to the IP checksum [
7,
40].
—
Adding, increasing, or decreasing the maximum segment size (MSS) value.
—
Description: This perturbation can only be applied to SYN and SYN-ACK packets. For packets that already have the MSS option, we only increase or decrease the value. The MSS value is limited to the range 0–65,535. The TCP options do not have a specific order, and the MSS option has a length of 2 or 4 bytes. The TCP MSS option specifies the maximum size of a TCP segment that a given endpoint can accommodate; it is exchanged by both communicating parties at the initiation of a connection and constrains the amount of data permissible in each individual TCP segment. We modify the MSS value by +/-1. This value is typically set to 1,460 bytes, so when decreasing it we ensure that it does not drop substantially below 1,460.
—
Side effects: This perturbation also indirectly affects byte numbers 17 and 18 of the TCP header.
—
Adding, increasing, or decreasing the window scale value.
—
Description: Window scaling is used to expand the window size beyond 65,535 bytes. It multiplies the window size to increase the available buffer size. The window scale value is limited to the range 0–14, and the window scale option has a length of 1 or 2 bytes. This perturbation can only be applied to SYN and SYN-ACK packets. For packets that do not have the window scale option by default, we add it to the SYN or SYN-ACK packets, while for packets that already have the option, we only increase or decrease the value. Tweaking the window scale does not impact the payload information, so there is no loss of functionality or maliciousness.
—
Side effects: This perturbation also indirectly affects byte numbers 17 and 18 of the TCP header.
—
Adding segment information.
—
Description: For this perturbation, we selected the most commonly occurring TCP payload information in the data set. Each time this action is chosen, a portion of this payload is appended sequentially as dead bytes to the end of the malicious packet's TCP payload. Since we add the dummy payload at the end of the segment and do not modify information that is already in the attack packets, the functionality of the perturbed packets remains intact [
15]. Note that this dummy payload can be carefully crafted by an adversary in a real-world scenario to satisfy certain structural and formatting requirements of the desired attack. For example, an adversary can add multiple HTTP parameters with the same name to bypass input validation or modify internal variable values in an HTTP parameter pollution exploitation. The adversary can then trigger an XSS attack by passing an empty string with an extra argument [
32].
—
Side effects: This perturbation affects byte values in the TCP segment and indirectly affects byte numbers 17 and 18 of the TCP header.
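As referenced above, the following sketch (using the scapy library) makes the side-effect handling concrete for the TTL perturbation: deleting the checksum fields forces them to be recomputed when the packet is rebuilt. The helper name and packet values are illustrative.

```python
from scapy.all import IP, TCP, raw

def perturb_ttl(pkt, delta=1):
    """Nudge the TTL byte and recompute the checksum bytes it affects."""
    p = pkt.copy()
    p[IP].ttl = max(1, min(255, p[IP].ttl + delta))  # keep TTL in [1, 255]
    del p[IP].chksum          # bytes 11-12 of the IP header: force recompute
    if TCP in p:
        del p[TCP].chksum     # recomputed defensively (TTL is not part of the
                              # TCP pseudo-header, so it is unchanged here)
    return p.__class__(raw(p))  # rebuilding fills in fresh checksums

pkt = IP(dst="198.51.100.7", ttl=64) / TCP(dport=80, flags="S")  # example only
adv = perturb_ttl(pkt, delta=-1)
```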
We tried various reward schemes with different values of positive and negative rewards in our reward function (see Equation (
6)). We obtained the best results, in terms of a higher average reward and faster convergence, with the following reward term values. We assigned a value of 200 to
\(r^+\) and -2 to
\(r^-\). If the perturbed sample successfully evaded all three classifiers in the ensemble model, then
\(r_t=600\) was passed on to the DRL agent. If the sample successfully evaded only two (one) of the three classifiers, then the DRL agent received a reward of 400 (200). However, if the sample failed to evade any of the classifiers, then a negative reward of
\(r_t=-2\) was assigned to the agent at time \(t\).
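A minimal sketch of this reward scheme (the function name is illustrative):

```python
R_POS, R_NEG = 200, -2  # reward terms r+ and r-

def reward(evaded_count):
    """evaded_count: how many of the 3 ensemble classifiers were evaded.
    Yields 600, 400, or 200 for 3, 2, or 1 evasions, and -2 for none."""
    return R_POS * evaded_count if evaded_count > 0 else R_NEG
```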
We conducted the experiments on a machine with a 12th Generation Intel Core i9-12950HX processor (30 MB cache, 24 threads, 16 cores) and an NVIDIA RTX A5500 graphics card (16 GB GDDR6 SDRAM). Table
8 shows the different hyperparameter values used in Algorithm
2 for training the DRL agent. We performed controlled exploration by employing an
\(\epsilon\)-greedy exploration approach with exponentially decaying value of
\(\epsilon\). The policy network and the target network implemented in the experiments are similar, and the latter is updated with the policy network weights every 10 time-steps. To expedite convergence, Kaiming normal initialization is used instead of random initialization for neural network weights [
18]. The maximum number of training episodes is chosen based on the improvement curve of the moving average of episodic rewards observed during DRL agent training. Figure
6 depicts one such moving average curve during the training of a DRL agent aimed at generating adversarial packets using the
training data samples for the Port Scan attack type. We noted that the average reward plateaus around 50,000 episodes across all attack types during the training phase.
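A sketch of the exploration schedule and target-network synchronization described above follows; the decay constants here are illustrative, with the actual hyperparameter values listed in Table 8.

```python
import math
import random

EPS_START, EPS_END, EPS_DECAY = 1.0, 0.05, 20_000  # assumed schedule constants
TARGET_UPDATE = 10  # sync target net with policy net every 10 time-steps

def epsilon(step):
    """Exponentially decaying epsilon for epsilon-greedy exploration."""
    return EPS_END + (EPS_START - EPS_END) * math.exp(-step / EPS_DECAY)

def select_action(q_values, step):
    if random.random() < epsilon(step):
        return random.randrange(len(q_values))  # explore: random perturbation
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit

# Inside the training loop (PyTorch-style, assumed):
# if step % TARGET_UPDATE == 0:
#     target_net.load_state_dict(policy_net.state_dict())
```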
6 Conclusions and Future Directions
In this article, we presented the development of a generalized methodology for creating adversarial network packets that can evade ML-based NIDS. The methodology is aimed at finding (near-)optimal perturbations that can be made to malicious network packets while evading detection and retaining functionality for communication. We posed this constrained packet perturbation problem as a sequential decision-making problem and solved it using a DRL approach. The DRL-enabled solution framework, Deep PackGen, consists of three main components, namely packet-based data set creation, ML-based packet classification model development, and DRL-based adversarial sample generation. Raw packet capture files from publicly available data were used to conduct the experiments. The framework generated curated data sets containing forward network packets, which were used to train and test five different types of DRL agents. Each agent was tailored to a specific attack type and evaluated on various classifiers. Results show that the Deep PackGen framework is successful in producing adversarial packets with an average ASR of 66.4% across all the classifiers in the network environment in which they were trained. The experimental results also show that the trained DRL agents produce an average ASR of 39.8% across various tree-based and nonlinear models in a different network environment.
6.1 Key Insights from the Study
Below, we present a summary of the insights obtained from this study that can guide future investigations.
—
The DRL agents have a higher success rate in evading tree-based packet classification models like DT and RF compared to nonlinear classifiers such as SVM and DNN.
—
The success rate of the DRL agents in generating adversarial samples is directly related to the key features that govern the decision boundary of the classifier and whether these features could be changed without disrupting the packet’s communication function.
—
Attacks that have feature values similar to those of benign traffic, such as DDoS and Port Scan, are more vulnerable to successful perturbation by an adversarial agent.
—
The more complex the decision boundary of the classifier, the larger the magnitude of the perturbation required for evasion, resulting in OOD samples.
—
The policies learned by the DRL agents are transferable to new network environments.
6.2 Research Scope and Pathways for Future Exploration
While our work addresses a specific set of objectives and parameters, below we provide a broader context for this study and highlight promising areas for future research and collaboration between academia and industry.
—
ML-based NIDS focus: Our framework is designed specifically for evading ML-based NIDS classifiers, an important type of NIDS increasingly adopted in dynamic cybersecurity environments. These classifiers leverage advanced algorithms to detect anomalous behaviors that may evade traditional signature-based systems. While ML-based systems present a unique target for adversarial attacks, the insights and methodologies developed in this study could be extended to signature-based and commercial NIDS platforms. However, these systems operate on fundamentally different principles, requiring distinct attack and evaluation strategies. Expanding our framework to assess their vulnerabilities would provide a fuller understanding of how various types of NIDS respond to adversarial techniques. This work could have significant implications for practical applications and can be best undertaken in collaboration with industry experts who have insights into the specific configurations and limitations of commercial NIDS products. Through this collaborative research, the cybersecurity community could make further progress toward developing robust, adaptable defenses against diverse intrusion tactics.
—
End-to-end attack success: Achieving end-to-end success for an adversarial attack within a live network context introduces additional complexities, as the mutated packets must also maintain intended malicious behavior across various network components, including routers, operating system kernels, and application layers. Our study relies on domain knowledge and established research to make reasoned assumptions that the crafted perturbations would preserve malicious functionality, which we validated using packet integrity checks in Wireshark. In a production environment, however, achieving consistent end-to-end attack success may require a more extensive validation framework, which can confirm packet integrity as well as full operational impact of adversarial actions across different network elements. Developing such an oracle could be instrumental for assessing and refining adversarial techniques across diverse system architectures and configurations, ultimately providing a more complete understanding of the real-world viability of these attacks.
—
Availability of training data: Our study assumes that an adversary can access network traffic data, either from publicly available sources or via synthetic generation, for the purpose of training. This assumption is based on the reality that network packets are generally structured according to standardized protocol regulations, meaning that packet formats and certain traffic patterns are consistent across organizations. This common structure allows adversarial techniques to be generalized beyond the confines of any specific network, enhancing the transferability of the attack policies generated by our DRL agent. Moreover, this standardization lowers barriers to entry for testing and validating evasion techniques across varied network environments. Future research could explore cases where adversaries have restricted or incomplete access to relevant traffic data, a scenario that would necessitate alternative methods for training effective evasion policies. Additionally, studies could investigate the impact of fine-tuning DRL agents with target-specific data when accessible, potentially increasing the precision of attack techniques. Such directions would enrich the understanding of how adversarial methods perform under diverse data availability conditions and further clarify the resilience of NIDS across network environments.
—
Single-iteration adversarial policy learning: In this study, we employ a single learning iteration between the adversary and the defender, wherein the adversary learns a (near-)optimal perturbation policy for evading ML-based NIDS detection. This approach provides a foundational framework for adversarial policy development, and it aligns with studies that explore attack methodologies in discrete timeframes. However, in real-world settings, both attackers and defenders are likely to continuously adapt their strategies in response to each other. Future research could incorporate a continuous or multi-iteration learning framework, in which adversarial agents adjust their evasion techniques over time as defenders enhance detection mechanisms in response. This dynamic framework could better reflect the ongoing adversarial arms race in cybersecurity. Furthermore, by analyzing how both attackers and defenders evolve within a closed-loop learning environment, researchers could identify more resilient, adaptive defense strategies that are capable of addressing evolving threat landscapes.