1 Introduction
A network intrusion detection system (NIDS) is a primary tool for cybersecurity operations centers (CSOCs) to detect cyber-attacks on computer networks. With the availability of high-performance computing resources and advancements in artificial intelligence (AI) and machine learning (ML) algorithms, intrusion detection mechanisms have greatly improved, serving the security needs of organizations. However, adversaries are also continuously advancing their toolchains, using AI/ML-enabled methodologies to camouflage their attacks and evade these ML-based NIDS. Hence, CSOCs must improve their security posture by proactively preparing for evasion attacks and making their NIDS robust against evolving adversaries.
Evasion attacks on NIDS are mainly conducted by perturbing network flow-based features to deceive ML models. Table
1 shows a summary of recent studies that focused on adversarial sample generation to evade NIDS. However, flow-based attacks are impractical, as reverse engineering flow-level perturbations into actual packets is complex and difficult [
41]. In addition, hidden correlations among different flow-based features further exacerbate the computational difficulty of replaying perturbations in a real network communication [
15]. More importantly, perturbations must be made such that the communication’s functionality is maintained. Hence, crafting adversarial attacks at the packet level is necessary to improve the practicality of implementing evasion attacks.
A few studies in recent literature have focused on using packet-based data to construct evasion attacks [
8,
15,
47]. These studies have utilized publicly available data sets to obtain the samples for obfuscation and relied on making random perturbations using trial-and-error and other approximation techniques. The generated adversarial samples were then tested against linear, tree-based, and nonlinear ML models for evasion. However, a significant drawback in most of these studies is that, even though they consider or collect packet-level data, the perturbations they make are applied to extracted or aggregated packet information, rather than the raw packets themselves. To the best of our knowledge, Chale et al. [
8] is the only work in the literature that perturbs the raw packets themselves, bypassing feature extraction altogether. The limitations of these studies can be summarized as follows. The perturbations were focused mainly on time-based features, to which a classifier can be made immune by training it on raw packet information. Some studies also generate adversarial samples through packet or payload injection, which damages the packets. Moreover, packet-level features are correlated with one another, so perturbing one feature directly impacts others in the classifier's feature set; none of these studies accounts for this phenomenon, known as the side effect of packet mutation [
37]. Another limitation of existing packet-based approaches is that they perturb both forward and backward packets (i.e., communication from the host to the destination and then the destination back to the host). Clearly, an adversary can only control the forward packets, those originating from the host and going to the destination (server).
Our proposed methodological framework addresses the above limitations in the following ways. Our methodology uses a learning-based approach, in which an AI agent is trained to make (near-)optimal perturbations to any given malicious packet. The agent learns to make these perturbations in a sequential manner using a
deep reinforcement learning (
DRL) approach. We identify the forward packets in network communication and only modify them to produce adversarial samples. We evaluate our adversarial samples against classifiers trained using packet-level data. We aim to make minimal and valid perturbations to the original packets that preserve the functionality of the communication. Examples of such perturbations include modifications to the valid portions in the
internet protocol (
IP) header,
transmission control protocol (
TCP) header, TCP options, and segment data. Furthermore, we only consider perturbing those features that can be obtained from the raw packet capture (PCAP) files without feature engineering. This makes it practical to replicate the attack using perturbed packets. We consider the side effects of packet mutation in this study. For example, any change to the IP or TCP header affects the IP and TCP checksum, respectively. We formalize the problem of side effects of packet mutation in Section
3.4 and provide a detailed description of the perturbations and their side effects in Section
4.3. We also evaluate whether the learning attained from one environment is transferable to another. We do this to gauge the effectiveness of our methodology in real-world settings where adversaries may not have any knowledge of the ML models and the data used to build the NIDS. We demonstrate the playability of the adversarial packet in a flow using the Wireshark application in the results Section
5 of the article. In summary, our article addresses the literature gap in constructing adversarial samples by developing a learning-based methodology with the following characteristics: only the forward packets are perturbed; valid perturbations are considered in order to maintain the functionality of the packets; side effects of perturbations are taken into account; the effectiveness of the adversarial agents is tested against unseen classifiers; and the transferability of the framework to other network environments is demonstrated.
This research study makes several contributions. The primary contribution is the development of a DRL-enabled methodology capable of generating adversarial network packets for evasion attacks on ML-based NIDS. Our methodological framework, Deep PackGen, takes raw network packets as inputs and generates adversarial samples camouflaged as benign packets. The DRL agent in this framework learns the (near-)optimal policy of perturbations that can be applied to a given malicious network packet, constrained by maintaining its functionality while evading the classifier. To the best of our knowledge, this is the first research study that poses the constrained network packet perturbation problem as a sequential decision-making problem and solves it using a DRL approach. Another novel aspect of this research is its packet-based approach to developing classification models for ML-based NIDS: the unidirectional (forward) packets from raw PCAP files are extracted and processed for machine computation, and the transformed network packets are then used to train the classifiers. Other contributions highly relevant to the cybersecurity research community include the insights obtained from the experiments and their analyses. Our investigation reveals that our methodology can generate out-of-distribution (OOD) packets that evade the decision boundaries of even complex nonlinear classifiers. Furthermore, we explain why packets of certain attack types can be manipulated more easily than others. The knowledge gained from this study on the adversary's ability to make specific perturbations to different types of malicious packets can be used by CSOCs to defend against evolving adversarial attacks.
The rest of the article is organized as follows. In Section
2, we present related literature pertaining to different types of intrusion detection mechanisms and adversarial attacks on ML models. We also present an overview of DRL approaches used in security and other application domains. Section
3 describes the DRL-enabled Deep PackGen framework for adversarial network packet generation. The data set creation process, packet classification model development, and the DRL solution approach are explained in this section. Section
4 discusses the numerical experiments conducted in this study. The performance of our methodological framework on publicly available data sets, the analysis of DRL agent’s policies, and the statistical analysis of the adversarial samples are presented in Section
5. Section
6 presents the insights obtained from this research study, along with the conclusions and future work.
4 Numerical Experiments
In this section, we outline the numerical experiments performed to evaluate our Deep PackGen methodology. We first discuss the experimental data, followed by the creation of ML-based packet classification models. Finally, we delve into the hyperparameters employed during the training and testing of the DRL agent. The goal is to train multiple DRL agents, each specifically designed to generate packets for a unique attack type, by interacting with a surrogate model specialized in identifying that type of attack. To achieve this, we generated several data sets for training and testing of the DRL agents.
4.1 Data Description
We conducted numerical experiments using raw PCAP files from two popular network intrusion detection data sets: CIC-IDS2017 [
45] and CIC-IDS2018 [
46]. These data sets contain both benign and attack communications, and provide a pragmatic representation of modern network traffic compared to older data sets like NSL-KDD and KDD-CUP [
19]. Additionally, the availability of raw PCAP files for the CICIDS data sets reduces dependency on extracted flow-level features [
42]. CIC-IDS2017 consists of PCAP files for five consecutive days (Monday to Friday), each with different attack types and sizes, as shown in Table
2. We processed these files to generate the data set, as explained in Section
3. Table
3 displays the number of forward packets extracted for each attack and its subtypes.
To ensure balanced data sets for our experiments, we followed these steps: First, we identified and removed attack types and subtypes with low flow instance counts to facilitate efficient training of packet classification models and DRL agents. Specifically, we excluded the Heartbleed and Botnet attack types, as well as the SQL Injection subtype from the Web Attack category. Next, we downsampled the flow instances of Denial of Service (DoS), Distributed Denial of Service (DDoS), and Port Scan attacks to achieve a near-equal representation of packets across all attack types. Due to substantial differences in both the structural characteristics and execution methodologies of various attacks, we trained individual DRL agents to specialize in distinct attack types. Each DRL agent underwent training using a surrogate packet classification model, developed with an equal number of benign and attack packets for the respective attack type. In constructing the surrogate classifiers, we employed a random selection process to choose attack samples from each attack type, along with an equal number of samples from benign communications. As discussed earlier, we only considered forward packets in our data set, as an adversary controls the generation and manipulation of packets that originate from its source. We extracted the payload bytes from each packet and represented each byte as a feature in this data set. We converted each byte value from hexadecimal to decimal and normalized each feature to the range 0–1, with the raw byte values ranging from 0 to 255. In total, there were 1,525 features.
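For illustration, the following Python sketch captures this byte-level feature extraction; the use of the scapy library, the zero-padding of short packets, and the file name are illustrative assumptions rather than the study's exact implementation.

```python
import numpy as np
from scapy.all import rdpcap, IP, TCP

NUM_FEATURES = 1525  # fixed-length feature vector, as described above

def packet_to_features(pkt):
    """Normalize a packet's raw bytes into a feature vector in [0, 1]."""
    if IP not in pkt or TCP not in pkt:
        return None                               # keep TCP/IP packets only
    raw_bytes = bytes(pkt[IP])                    # IP header, TCP header, payload
    vals = list(raw_bytes)[:NUM_FEATURES]         # truncate oversized packets
    vals += [0] * (NUM_FEATURES - len(vals))      # zero-pad short packets (assumed)
    return np.array(vals, dtype=np.float32) / 255.0

packets = rdpcap("forward_packets.pcap")          # hypothetical input file
X = [v for v in (packet_to_features(p) for p in packets) if v is not None]
```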
We utilized the CIC-IDS2017 data set for training and testing our framework as follows. We divided the data into different attack types, including samples from the benign category. For each attack type, we split the data into three parts: 60% for training the DRL agent and building the surrogate model in the training phase, 30% for building other packet classification models to test the trained DRL agent, and 10% for generating adversarial samples and performing evaluation in the testing phase. In the rest of the article, we will refer to them as training, ML model testing 1, and DRL agent testing 1 data sets, respectively, for each attack type.
Further, to evaluate the performance of the trained agents on different network traffic data, we utilized the CIC-IDS2018 data set. Table
4 shows the different attack types and sizes of the PCAP files in the CIC-IDS2018 data set. We extracted packets for the various attack types, including DoS, Web Attack, Infiltration, Port Scan, and DDoS from these files. Table
3 displays the number of forward packets for them. We followed the same process of balancing the instances of attacks across all attack types as was done with the CIC-IDS2017 data set. We split the data for each attack type into two parts: 70% of the data was allocated for building the ML models for the detection of the respective attack type in the testing phase, and the remaining 30% was utilized to measure the attack success rate (ASR) of the respective trained DRL agent. We refer to these two parts as
ML model testing 2 and
DRL agent testing 2 data sets, respectively. Figure
5 shows a schematic of the above-mentioned data splitting strategy from the two CICIDS data sources and their respective uses in the training and testing phases of the framework. Note that the DRL agents were not trained with the CIC-IDS2018 data samples.
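A minimal sketch of this splitting strategy for the CIC-IDS2017 data is shown below; the stratified sampling and fixed random seed are illustrative choices rather than the exact procedure used in the experiments.

```python
from sklearn.model_selection import train_test_split

def split_cic_ids2017(X, y, seed=42):
    # 60% for DRL agent training and surrogate models; 40% held out
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, train_size=0.60, stratify=y, random_state=seed)
    # Remaining 40% -> 30% (ML model testing 1) + 10% (DRL agent testing 1)
    X_ml1, X_drl1, y_ml1, y_drl1 = train_test_split(
        X_rest, y_rest, train_size=0.75, stratify=y_rest, random_state=seed)
    return (X_train, y_train), (X_ml1, y_ml1), (X_drl1, y_drl1)
```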
4.2 Packet Classification Model Creation
We developed different sets of ML models for both training and testing the DRL agents. Data sets were carefully prepared for developing the surrogate (training phase) and the testing (testing phase) models. ML models in the training phase were developed using the
training data. In contrast, those used for testing the trained DRL agents were developed using either the
ML model testing 1 or
ML model testing 2 data sets (see Figure
5). A DRL agent was trained to perturb packets for each attack type. To effectively train the agent to deceive the classifier’s decision boundary, we used a surrogate model specializing in that respective attack type in the agent’s training environment. For example, if the DRL agent was being trained on perturbing the packets of a
Port Scan attack, then the surrogate model was trained with the forward packets extracted from the network flow data of the same attack.
In the training phase, we selected an ensemble of ML models to act as a surrogate model. We randomly sampled 80% of the
training data to train various ML models, including linear, tree-based, and nonlinear classification models. The performances of all these models on the remaining 20% of the
training data were comparable across each attack type, with most of them achieving an accuracy of around 99%. We selected one model from each of the three types of classifiers for the ensemble:
logistic regression (
LR), decision tree (DT), and
multi-layer perceptron (
MLP). Table
5 shows the accuracy scores of these models that form the ensemble in the training environment.
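As a minimal sketch, the surrogate ensemble can be assembled as follows; the scikit-learn hyperparameters shown are defaults and illustrative, not the tuned values used in the study, and the benign label is assumed to be 0.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

def train_surrogate_ensemble(X_train, y_train):
    models = {
        "LR": LogisticRegression(max_iter=1000),
        "DT": DecisionTreeClassifier(),
        "MLP": MLPClassifier(hidden_layer_sizes=(128, 64)),
    }
    for model in models.values():
        model.fit(X_train, y_train)
    return models

def num_evaded(models, x):
    """Number of ensemble classifiers that label sample x as benign (0)."""
    return sum(int(m.predict(x.reshape(1, -1))[0] == 0) for m in models.values())
```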
Similarly, ML models were developed for the testing phase. Two sets of models were trained: one using the
ML model testing 1 data set and another using the
ML model testing 2 data set. Note that both these data sets contain previously unseen samples by the DRL agents. In addition, the latter contains samples from a different network than that used to train the agents. We used a similar split of 80% on these data sets to train the respective sets of models. The hyperparameters used for training the various models, including
random forest (
RF), deep neural network (DNN), and
support vector machine (
SVM), among others, are shown in Table
6. The values of some of the hyperparameters were adopted from literature [
12,
21], and others were experimentally determined. The testing accuracy scores of a sample list of models for each attack type on the remaining 20% of the respective data sets are shown in Table
7. All these models performed very well in accurately differentiating between benign and malicious samples.
4.3 Adversarial DRL Agent Training
The state space of the DRL agent consists of the 1,525 features extracted from the network packet and its classification label. As discussed in the data set creation component (Section
3.1) of the framework, these features are the normalized values of the bytes pertaining to different TCP/IP header and segment information (see Figure
2). Our focus in this study is to find the (near-)optimal set of perturbations that can be applied to a given malicious network packet to generate a successful adversarial sample, while maintaining the functionality of communication. To show the effectiveness of our methodology, we selected a set of valid perturbations (
\(\Delta\)) based on domain knowledge and insights from literature studies [
1,
3,
14,
15,
23,
32,
34,
43,
50,
59]. Below is a list of the perturbation categories we selected as part of the agents' action space, along with their descriptions and side effects; a code sketch illustrating one such perturbation and its side-effect handling follows the list.
—
Modifying the fragmentation bytes from do not fragment to do fragment.
—
Description: This perturbation can be applied to packets in which fragmentation is turned off. The do fragment command takes the hexadecimal value 00. Turning this flag on breaks the packet into multiple parts; however, no payload information from the packet is deleted or modified.
—
Side effects: This perturbation directly affects byte numbers 7 and 8, and indirectly affects byte numbers 9, 11, and 12 of the IP header, where byte 9 represents the
time to live (
TTL) value, and bytes 11 and 12 represent the IP checksum value. The IP checksum is calculated by summing all the fields present in the IP header, skipping only the checksum bytes [
40]. The checksum value changes with the change in the byte value of the fragmentation bytes. The TTL value is also adjusted as a result of fragmentation [
38].
—
Modifying the fragmentation bytes from do not fragment to more fragment.
—
Description: This perturbation can be applied to packets where fragmentation is turned on or off. The more fragment command takes the hexadecimal value 20. Turning this flag on signifies there are more fragments incoming from the same flow. This also does not alter any of the packet information to be transmitted.
—
Side effects: This perturbation directly affects byte numbers 7 and 8, and indirectly affects byte numbers 9, 11, and 12 of the IP header.
—
Increasing or decreasing the TTL byte value.
—
Description: Any valid perturbation to this byte must result in a final TTL value between 1 and 255. If the TTL value is decreased significantly, the packet might expire before reaching its destination, causing loss of functionality. Hence, we change this value by a small amount (+/-1) to ensure the TTL does not change substantially.
—
Side effects: This perturbation directly affects byte number 9, and indirectly affects bytes 11 and 12 of the IP header.
—
Increasing or decreasing the window size bytes.
—
Description: Any valid perturbation to these bytes must result in a final window size value between 1 and 65,535. TCP is a connection-oriented protocol that tracks transmitted data: the sender transmits segments and retransmits them if the receiver does not acknowledge them within the expected timeframe. TCP incorporates a mechanism known as windowing, in which the receiver's acknowledgment tells the sender how much data can be transmitted before another acknowledgment is expected. This parameter, the window size, indicates the capacity of the receive buffer that hosts use to temporarily store data until the application can process it. We change this value by a small amount so that it does not alter the functionality or maliciousness of the packet.
—
Side effects: This perturbation directly impacts byte numbers 15 and 16 of the TCP header, and indirectly impacts bytes 17 and 18 of the TCP header, which represent the TCP checksum, calculated similarly to the IP checksum [
7,
40].
—
Adding, increasing, or decreasing the maximum segment size (MSS) value.
—
Description: This perturbation can only be applied to SYN and SYN-ACK packets. For packets that already have the MSS option, we only increase or decrease the value. The MSS value is limited to the range 0–65,535. The TCP options do not have a specific order, and the MSS option has a length of 2 or 4 bytes. The TCP MSS option specifies the maximum size of a TCP segment that a given endpoint can accommodate; it is exchanged by both communicating parties at the initiation of a connection and constrains the amount of data permissible in each individual TCP segment. We modify the MSS value by +/-1. This value is typically set to 1,460 bytes, so when decreasing it we ensure that it does not drop substantially below 1,460.
—
Side effects: This perturbation also indirectly affects byte numbers 17 and 18 of the TCP header.
—
Adding, increasing, or decreasing the window scale value.
—
Description: Window scaling is used to expand the window size beyond 65,535 bytes. It multiplies the window size to increase the available buffer size. The window scale value is limited to the range 0–14, and the window scale option has a length of 1 or 2 bytes. This perturbation can only be applied to SYN and SYN-ACK packets. For packets that do not have the window scale option by default, we add it to the SYN or SYN-ACK packets, while for packets that already have the option, we only increase or decrease the value. Tweaking the window scale does not impact the payload information, so there is no loss of functionality or maliciousness.
—
Side effects: This perturbation also indirectly affects byte numbers 17 and 18 of the TCP header.
—
Adding segment information.
—
Description: For this perturbation, we selected the most commonly occurring TCP payload information in the data set. Each time this action is chosen, a portion of this payload is appended sequentially as dead bytes to the end of the malicious packet's TCP payload. Since we add the dummy payload at the end of the segment and do not modify information that is already in the attack packets, the functionality of the perturbed packets remains intact [
15]. Note that this dummy payload can be carefully crafted by an adversary in a real-world scenario to satisfy certain structural and formatting requirements of the desired attack. For example, an adversary can add multiple HTTP parameters with the same name to bypass input validation or modify internal variable values in an HTTP parameter pollution exploitation. The adversary can then trigger an XSS attack by passing an empty string with an extra argument [
32].
—
Side effects: This perturbation affects byte values in the TCP segment and indirectly affects byte numbers 17 and 18 of the TCP header.
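As referenced above, the following sketch (using the scapy library) makes the side-effect handling concrete for the TTL perturbation: deleting the checksum fields forces them to be recomputed when the packet is rebuilt. The helper name and packet values are illustrative.

```python
from scapy.all import IP, TCP, raw

def perturb_ttl(pkt, delta=1):
    """Nudge the TTL byte and recompute the checksum bytes it affects."""
    p = pkt.copy()
    p[IP].ttl = max(1, min(255, p[IP].ttl + delta))  # keep TTL in [1, 255]
    del p[IP].chksum          # bytes 11-12 of the IP header: force recompute
    if TCP in p:
        del p[TCP].chksum     # recomputed defensively (TTL is not part of the
                              # TCP pseudo-header, so it is unchanged here)
    return p.__class__(raw(p))  # rebuilding fills in fresh checksums

pkt = IP(dst="198.51.100.7", ttl=64) / TCP(dport=80, flags="S")  # example only
adv = perturb_ttl(pkt, delta=-1)
```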
We tried various reward schemes with different values of positive and negative rewards in our reward function (see Equation (
6)). We obtained the best results, in terms of a higher average reward and faster convergence, with the following reward term values. We assigned a value of 200 to
\(r^+\) and -2 to
\(r^-\). If the perturbed sample successfully evaded all three classifiers in the ensemble model, then
\(r_t=600\) was passed on to the DRL agent. If the sample successfully evaded only two (one) of the three classifiers, then the DRL agent received a reward of 400 (200). However, if the sample failed to evade any of the classifiers, then a negative reward of
\(r_t=-2\) was assigned to the agent at time \(t\).
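A minimal sketch of this reward scheme (the function name is illustrative):

```python
R_POS, R_NEG = 200, -2  # reward terms r+ and r-

def reward(evaded_count):
    """evaded_count: how many of the 3 ensemble classifiers were evaded.
    Yields 600, 400, or 200 for 3, 2, or 1 evasions, and -2 for none."""
    return R_POS * evaded_count if evaded_count > 0 else R_NEG
```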
We conducted the experiments on a machine with a 12th Generation Intel Core i9-12950HX processor (30 MB cache, 24 threads, 16 cores) and an NVIDIA RTX A5500 graphics card (16 GB GDDR6 SDRAM). Table
8 shows the different hyperparameter values used in Algorithm
2 for training the DRL agent. We performed controlled exploration by employing an
\(\epsilon\)-greedy exploration approach with exponentially decaying value of
\(\epsilon\). The policy network and the target network implemented in the experiments are similar, and the latter is updated with the policy network weights every 10 time-steps. To expedite convergence, Kaiming normal initialization is used instead of random initialization for neural network weights [
18]. The maximum number of training episodes is chosen based on the improvement curve of the moving average of episodic rewards observed during DRL agent training. Figure
6 depicts one such moving average curve during the training of a DRL agent aimed at generating adversarial packets using the
training data samples for the Port Scan attack type. We noted that the average reward plateaus around 50,000 episodes across all attack types during the training phase.
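A sketch of the exploration schedule and target-network synchronization described above follows; the decay constants here are illustrative, with the actual hyperparameter values listed in Table 8.

```python
import math
import random

EPS_START, EPS_END, EPS_DECAY = 1.0, 0.05, 20_000  # assumed schedule constants
TARGET_UPDATE = 10  # sync target net with policy net every 10 time-steps

def epsilon(step):
    """Exponentially decaying epsilon for epsilon-greedy exploration."""
    return EPS_END + (EPS_START - EPS_END) * math.exp(-step / EPS_DECAY)

def select_action(q_values, step):
    if random.random() < epsilon(step):
        return random.randrange(len(q_values))  # explore: random perturbation
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit

# Inside the training loop (PyTorch-style, assumed):
# if step % TARGET_UPDATE == 0:
#     target_net.load_state_dict(policy_net.state_dict())
```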
6 Conclusions and Future Directions
In this article, we presented the development of a generalized methodology for creating adversarial network packets that can evade ML-based NIDS. The methodology is aimed at finding (near-)optimal perturbations that can be made to malicious network packets while evading detection and retaining functionality for communication. We posed this constrained packet perturbation problem as a sequential decision-making problem and solved it using a DRL approach. The DRL-enabled solution framework, Deep PackGen, consists of three main components, namely packet-based data set creation, ML-based packet classification model development, and DRL-based adversarial sample generation. Raw packet capture files from publicly available data were used to conduct the experiments. The framework generated curated data sets containing forward network packets, which were used to train and test five different types of DRL agents. Each agent was tailored to a specific attack type and evaluated on various classifiers. Results show that the Deep PackGen framework is successful in producing adversarial packets with an average ASR of 66.4% across all the classifiers in the network environment in which they were trained. The experimental results also show that the trained DRL agents produce an average ASR of 39.8% across various tree-based and nonlinear models in a different network environment.
6.1 Key Insights from the Study
Below, we present a summary of the insights obtained from this study that can guide future investigations.
—
The DRL agents have a higher success rate in evading tree-based packet classification models like DT and RF compared to nonlinear classifiers such as SVM and DNN.
—
The success rate of the DRL agents in generating adversarial samples is directly related to the key features that govern the decision boundary of the classifier and whether these features could be changed without disrupting the packet’s communication function.
—
Attacks that have feature values similar to those of benign traffic, such as DDoS and Port Scan, are more vulnerable to successful perturbation by an adversarial agent.
—
The more complex the decision boundary of the classifier, the larger the magnitude of the perturbation required for evasion, resulting in OOD samples.
—
The policies learned by the DRL agents are transferable to new network environments.
6.2 Research Scope and Pathways for Future Exploration
While our work addresses a specific set of objectives and parameters, below we provide a broader context for this study and highlight promising areas for future research and collaboration between academia and industry.
—
ML-based NIDS focus: Our framework is designed specifically for evading ML-based NIDS classifiers, an important type of NIDS increasingly adopted in dynamic cybersecurity environments. These classifiers leverage advanced algorithms to detect anomalous behaviors that may evade traditional signature-based systems. While ML-based systems present a unique target for adversarial attacks, the insights and methodologies developed in this study could be extended to signature-based and commercial NIDS platforms. However, these systems operate on fundamentally different principles, requiring distinct attack and evaluation strategies. Expanding our framework to assess their vulnerabilities would provide a fuller understanding of how various types of NIDS respond to adversarial techniques. This work could have significant implications for practical applications and can be best undertaken in collaboration with industry experts who have insights into the specific configurations and limitations of commercial NIDS products. Through this collaborative research, the cybersecurity community could make further progress toward developing robust, adaptable defenses against diverse intrusion tactics.
—
End-to-end attack success: Achieving end-to-end success for an adversarial attack within a live network context introduces additional complexities, as the mutated packets must also maintain intended malicious behavior across various network components, including routers, operating system kernels, and application layers. Our study relies on domain knowledge and established research to make reasoned assumptions that the crafted perturbations would preserve malicious functionality, which we validated using packet integrity checks in Wireshark. In a production environment, however, achieving consistent end-to-end attack success may require a more extensive validation framework, which can confirm packet integrity as well as full operational impact of adversarial actions across different network elements. Developing such an oracle could be instrumental for assessing and refining adversarial techniques across diverse system architectures and configurations, ultimately providing a more complete understanding of the real-world viability of these attacks.
—
Availability of training data: Our study assumes that an adversary can access network traffic data, either from publicly available sources or via synthetic generation, for the purpose of training. This assumption is based on the reality that network packets are generally structured according to standardized protocol regulations, meaning that packet formats and certain traffic patterns are consistent across organizations. This common structure allows adversarial techniques to be generalized beyond the confines of any specific network, enhancing the transferability of the attack policies generated by our DRL agent. Moreover, this standardization lowers barriers to entry for testing and validating evasion techniques across varied network environments. Future research could explore cases where adversaries have restricted or incomplete access to relevant traffic data, a scenario that would necessitate alternative methods for training effective evasion policies. Additionally, studies could investigate the impact of fine-tuning DRL agents with target-specific data when accessible, potentially increasing the precision of attack techniques. Such directions would enrich the understanding of how adversarial methods perform under diverse data availability conditions and further clarify the resilience of NIDS across network environments.
—
Single-iteration adversarial policy learning: In this study, we employ a single learning iteration between the adversary and the defender, wherein the adversary learns a (near-)optimal perturbation policy for evading ML-based NIDS detection. This approach provides a foundational framework for adversarial policy development, and it aligns with studies that explore attack methodologies in discrete timeframes. However, in real-world settings, both attackers and defenders are likely to continuously adapt their strategies in response to each other. Future research could incorporate a continuous or multi-iteration learning framework, in which adversarial agents adjust their evasion techniques over time as defenders enhance detection mechanisms in response. This dynamic framework could better reflect the ongoing adversarial arms race in cybersecurity. Furthermore, by analyzing how both attackers and defenders evolve within a closed-loop learning environment, researchers could identify more resilient, adaptive defense strategies that are capable of addressing evolving threat landscapes.