- Research
- Open access
A proactive defense method against eavesdropping attack in SDN-based storage environment
Cybersecurity volume 7, Article number: 58 (2024)
Abstract
The integration of Software-Defined Networking (SDN) in storage centers aims to enhance storage performance. However, this integration also introduces new concerns, particularly the potential eavesdropping attacks that pose a substantial risk to data privacy. By issuing flow tables (e.g., via compromised SDN switches), attackers can conveniently collect target traffic and extract confidential information with session reassembly methods. To proactively mitigate such attacks by preventing session reassembly, various moving target defense methods, such as end hopping, have been proposed. However, this study uncovers several deficiencies within existing end hopping methods. To address these deficiencies, we propose a novel linkage-field-based self-synchronizing end hopping method, which obfuscates end information (e.g., IP, Port) and linkage fields (e.g., sequence number and ID number) without third-party assistance. Furthermore, to counter the potential invalidation of end hopping methods resulting from brute-force reassembly of a small number of sessions, we propose a fake segment injection method. Extensive experiments have been conducted both in simulation and real-world environment to evaluate the effectiveness of our proposed methods. The results demonstrate that our proposed methods can effectively defend against eavesdropping attacks with acceptable performance overhead.
Introduction
The SDN architecture enables dynamic configuration of forwarding rules via programming, which can facilitate network deployment and management. These advantages have led to remarkable performance improvement in storage systems deployed on SDN (Handley et al. 2017; Saha et al. 2017; Guillen et al. 2018). However, the prominent features also create opportunities for attacks. Attackers can initiate eavesdropping attacks by compromising switches and issuing flow tables (Akhunzada et al. 2015; Yu et al. 2021). Combining this with session reassembly methods, they can further extract confidential information from the collected traffic (Moghaddam and Mosenia 2019). Moreover, the attacks initiated in the storage center have a greater impact than those launched on a single router or link. Therefore, addressing this issue is of utmost importance.
Conventional privacy-preserving schemes for transient data mainly focus on encryption (Trabelsi et al. 2004; Khan et al. 2012), which prevents attackers from accessing the content. However, encryption schemes for transient data are inefficient for storage systems. Additionally, terminal-based anti-sniffing methods (Gregorczyk et al. 2020; Kher and Kim 2005; Alneyadi et al. 2016) cannot identify eavesdropping attacks at the network layer. Hence, we focus on privacy preservation at the network layer, specifically on moving target defense methods that prevent eavesdropping attacks by interfering with session reassembly, as demonstrated in Govil et al. (2020) and Datta et al. (2019).
Typical moving target defense methods against eavesdropping attacks include path hopping, end hopping and a combination of both. Path hopping involves delivering segments through different paths to prevent attackers from capturing all of them. End hopping obfuscates the end information (such as IP, Port) of segments to prevent session reassembly. Based on synchronizing rules, end hopping methods can be further categorized into reconfiguration-based end hopping (RCEH), time-synchronizing end hopping (TSEH) and packet-based self-synchronizing end hopping (PSEH). However, existing hopping methods still have deficiencies for the following reasons: (1) Limitation of topology. The path hopping method relies on extensive alternative paths to deliver segments, but attackers can obtain most segments in a topology (e.g., Fat-Tree) constructed from a limited number of paths (Zhou et al. 2021; Liu et al. 2021). (2) Limitation of hopping frequency. In RCEH methods (Jafarian et al. 2012, 2014; MacFarland and Shue 2015; Skowyra et al. 2016; Wang et al. 2018; Moghaddam and Mosenia 2019; Govil et al. 2020), the hopping interval is usually large (e.g., 60 s; Skowyra et al. 2016) for system stability. However, this results in limited obfuscation of segments. (3) Deficiencies of synchronizing modules. In TSEH methods, systems require precise time synchronization (Luo et al. 2017; Chang et al. 2018; Chen et al. 2020; Wang et al. 2021; Cunha et al. 2021). However, the synchronization module can introduce new attack surfaces (DeCusatis et al. 2019; Luo et al. 2017). (4) Ignorance of linkage fields. Most end hopping methods ignore certain linkage fields that can be exploited for session reassembly. (5) Ignorance of brute-force reassembly of a small number of sessions. Brute-force reassembly of a small number of sessions (e.g., a single session) is not a challenging task and can render all the end hopping methods ineffective, as detailed in Sect. 4.
To address these deficiencies, we propose a novel linkage-field based self-synchronizing end hopping method (LSEH), which obfuscates end information (e.g., IP, Port) and linkage fields (e.g., sequence number and ID number) without third-party assistance. Compared to existing end hopping methods, LSEH covers the necessary linkage fields, making it more secure. Additionally, it demonstrates higher computational efficiency. On the basis, we propose a fake segment injection method (FSI) that utilizes the SDN controller to inject fake segments into each domain, so as to mitigate brute-force reassembly of a small number of sessions. The contributions of this study are as follows:
(1) For the first time, we theoretically and experimentally demonstrate the deficiencies of existing end hopping methods that ignore obfuscating certain linkage fields and brute-force reassembly of a small number of sessions. To address these issues, we propose the linkage-field based self-synchronizing end hopping method and the fake segment injection method.
(2) We introduce the use of permutation entropy to evaluate the obfuscation level of linkage fields, and define the security gain to evaluate the security enhancement of the proposed methods. These metrics provide valuable insights for evaluating the efficacy of end hopping methods.
(3) To evaluate the effectiveness of our proposed methods, we have conducted extensive experiments in both simulation and real-world environments. The results reveal that our proposed methods can effectively defend against eavesdropping attacks in SDN-based storage environments with acceptable performance overhead.
Related work
Typical moving target defense methods against eavesdropping attacks include path hopping, end hopping and a combination of both. In this section, we will discuss the related works on each of these methods.
Path hopping
Duan et al. (2013) first proposed a random routing mutation mechanism (RRM) that uses a constraint model to represent the path selection problem and solves it with satisfiability modulo theories (SMT). This model enables quick selection of a satisfying path while ensuring end-to-end QoS. Aseeri et al. (2017) introduced a bidirectional RRM to prevent attackers from eavesdropping on retransmitted packets that result from dropped ACK responses. Zhang et al. (2019) utilized the Q-learning algorithm to train the path selection model, giving it security awareness. Considering RRM’s capability and an attacker’s long-term background knowledge, Xu et al. (2021) established a stochastic optimization model to solve the path selection problem and improve performance. Zhou et al. (2021) proposed an adaptive RRM method called AMS, which measures link delay using In-band Network Telemetry (INT) technology and schedules sub-flows (series of segments of a session) based on the measured results to alleviate disordered segments. Xu et al. (2022) achieved fine-grained mutation by implementing RRM based on Deep Deterministic Policy Gradient (DDPG), which utilizes INT technology to measure network state and schedules each segment according to the training results.
However, in SDN-based networks, RRM methods exhibit significant drawbacks. Firstly, RRM can cause out-of-order segments, which leads to increased transmission delay and performance degradation. Secondly, in topologies with limited alternative end-to-end paths, such as Fat-Tree, attackers can still obtain most segments from aggregation switches. Therefore, the path hopping method is not suitable for mitigating eavesdropping attacks in storage systems, and this study focuses solely on end hopping methods that are not restricted by network topologies.
End hopping
Among RCEH methods, OF-RHM (Jafarian et al. 2012, 2014) periodically reconfigures the mapping of real and virtual IP addresses to hide the real address of hosts or services, thereby mitigating reconnaissance attacks. Building on this, MultiRHM (Jafarian and Niakanlahiji 2023) anonymizes identifying attributes of hosts over different dimensions (time and space). However, MacFarland and Shue (2015) pointed out that OF-RHM consumes excessive flow entries. They suggested combining SDN with proxies to address this issue. Based on this idea, PHEAR (Skowyra et al. 2016) and U-TRI (Wang et al. 2018) utilize proxies to periodically reconfigure fingerprinting fields (e.g., IP ID, TTL, etc.). PANEL (Moghaddam and Mosenia 2019) is implemented on P4 switches and provides a practical light-weight anonymity solution by obfuscating linkage fields, such as TCP sequence number and IP ID number. MIMIQ (Govil et al. 2020) implemented an end hopping method based on QUIC, which acts as a hopping protocol in the transport layer. To defend against service-oriented man-in-the-middle attacks in Kubernetes, Ma et al. (2023) proposed a proactive defense mechanism that involves an address mutation module and a connection ID mutation module. However, the hopping frequency of RCEH is too low to effectively prevent session reassembly.
Among TSEH methods, RPAH (Luo et al. 2017) uses system time as a nonce to calculate a PRF (pseudo random function, such as the HMAC-MD5 algorithm), which is then utilized to obfuscate IP and Port fields. Chang et al. (2018) use IP address and timestamp as the parameters of the PRF for IP obfuscation and recovery. Similarly, other studies (Chen et al. 2020; Wang et al. 2021; Cunha et al. 2021; Fan et al. 2022) use timestamps and a PRF for synchronization, albeit in distinct application environments. Notably, the time synchronizing module introduces a new attack surface (DeCusatis et al. 2019; Luo et al. 2017). Compromised time synchronizing modules can lead to packet retransmission and significant performance degradation.
In PSEH method, KHSS (Luo et al. 2017) uses the invariable part (e.g., payload) of a segment for synchronization. This self-synchronizing end hopping method can obfuscate each segment without third-party assistance. SPINE (Datta et al. 2019) extends IPv4 packets with the IPv6 protocol to embed a nonce into a modified packet header. The nonce is used for synchronization, specifically PRF calculation and obfuscation recovery. However, these methods remain vulnerable owing to the ignorance of certain linkage fields during obfuscation, as detailed in Sect. 3.
While the literature (Zhao et al. 2016; Zhang et al. 2017; Liu et al. 2021) attempts to combine path hopping with end hopping to enhance security, it has not overcome the limitations of these hopping methods. Particularly, P4NIS (Liu et al. 2021) incorporates encryption into end hopping to improve security. However, the performance overhead, with an average throughput of 4.24 Mbps, is unacceptable for storage systems. Furthermore, a small number of sessions can still be reassembled by brute-force, regardless of how the segments are obfuscated. Hence, it is crucial to introduce additional mechanisms to address this issue. For a comprehensive comparison of our research with related methods, please refer to Table 1.
Adversary model
Background and assumptions
Firstly, we outline the process of initiating an eavesdropping attack in an SDN-based storage environment. As shown in Fig. 1, attackers can compromise SDN switches to install specific flow entries to convey target data (Step 1). When the target segments pass through the compromised switches, they are either copied or encapsulated and then delivered through the leakage channel (Steps 2 and 3). By reassembling the obtained segments into complete sessions, the attackers can extract confidential information from them (Step 4). Moreover, if the decryption key is leaked (Step 5), the plaintext is directly exposed, which results in a heightened level of damage to data privacy. Subsequently, we present the principle of end hopping methods for mitigating this attack. In Fig. 1, the gateways are deployed with end hopping methods, so a segment can be obfuscated and correctly recovered only if it passes through the gateways sequentially. Therefore, according to the eavesdropping attack process, the attackers can only obtain obfuscated segments and face difficulties in reassembling them into complete sessions.
The assumptions of the adversary model are as follows: (1) Attackers can install specific flow entries, allowing them to initiate eavesdropping attacks and capture all the segments passing through SDN switches. Meanwhile, the entities involved in the storage system, including gateways, SDN controller and storage servers, are considered trusted, meaning that they are not compromised by attackers. (2) An attack can only be successful if the segments of a session can be successfully reassembled, while partial segments alone do not lead to data leakage. This assumption is based on the fact that in a storage system, files are divided into multiple blocks for storage, and each block is further divided into multiple segments for transmission. Hence, it is challenging for attackers to extract meaningful information from partial segments. This assumption also simplifies the establishment of our model, as we lack specific knowledge about the information carried by individual segments. Table 2 provides a list of major notations used in this study along with their corresponding descriptions.
Adversary model analysis
Firstly, we present the session reassembly process in Algorithm 1, which serves as the foundation for the subsequent adversary model. For the segments that are not obfuscated, the reassembly process is based on end information (Phase 1). For the segments that are partially obfuscated, the reassembly process is based on linkage fields (Phase 2), as depicted in Fig. 2. The time window T represents the interval for assigning a segment to its corresponding session. Since the successful reassembly of segments relies on the absence of conflicts among the values of linkage fields within T, the collision rate of linkage fields can be utilized to analyze the one-time success rate of the attack.
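Phase 2 of the reassembly process can be illustrated with a minimal sketch. It is not the authors' Algorithm 1 (which is not reproduced in this text); it only assumes that an attacker chains segments whose TCP sequence number equals the previous segment's sequence number plus its payload length, provided the gap between the two segments is within the time window T. The `Segment` class and the function name are hypothetical.

```python
from dataclasses import dataclass

SEQ_SPACE = 2 ** 32  # size of the TCP sequence number space


@dataclass
class Segment:
    ts: float    # capture time in seconds
    seq: int     # TCP sequence number
    length: int  # payload length in bytes


def reassemble_by_seq(segments, window):
    """Group segments into sessions using only sequence-number continuity."""
    sessions = []  # each entry: {"next_seq", "last_ts", "segments"}
    for seg in sorted(segments, key=lambda s: s.ts):
        match = None
        for sess in sessions:
            in_window = seg.ts - sess["last_ts"] <= window
            if in_window and sess["next_seq"] == seg.seq:
                match = sess
                break
        if match is None:
            # No open session expects this sequence number: start a new one.
            match = {"segments": []}
            sessions.append(match)
        match["segments"].append(seg)
        match["next_seq"] = (seg.seq + seg.length) % SEQ_SPACE
        match["last_ts"] = seg.ts
    return [sess["segments"] for sess in sessions]


# Two interleaved sessions with different initial sequence numbers.
captured = [
    Segment(0.00, 1000, 100), Segment(0.01, 50000, 200),
    Segment(0.02, 1100, 100), Segment(0.03, 50200, 200),
    Segment(0.04, 1200, 100),
]
groups = reassemble_by_seq(captured, window=1.0)
```

If two sessions expect the same next sequence number within the window, this sketch simply picks the first match, which is exactly the kind of collision the adversary model quantifies.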
Subsequently, we establish the adversary model to analyze the one-time success rate of the attack based on the collision rate of linkage fields. A collision on the values of linkage fields should satisfy two conditions: (1) The difference between the initial values of linkage fields for two sessions should be divisible by M, which represents the increment size of linkage fields. (2) Within time window T, the values of linkage fields for one session should coincide with those of the other session. In a TCP/IPv4 protocol based storage system, the linkage fields include TCP sequence number and IP ID number. Assuming the transmission process of the sessions is independent and follows a homogeneous Poisson process, we can calculate the collision rate of TCP sequence number and IP ID number as follows.
For sessions \(s_i,s_j\left( 1 \le i \ne j \le N\right)\), where their initial TCP sequence numbers are \(e_i, e_j\left( 0 \le e_i< e_j < L_{seq}\right)\), we can calculate the differences of their values \(d_{ij}=e_j-e_i\) and \(d_{ji}=L_{seq}-e_j+e_i\) (in case the sequence number wraps around). Based on the above two conditions, we can derive Eqs. (1) and (2) as the two sub-expressions of the collision rate of TCP sequence numbers. Furthermore, since the values of \(e_{i}, e_{j}\) are unknown, we assume that \(d_{ij}\) follows a uniform distribution with the range \(\left[ 0,L_{seq}\right)\). Let \(\sigma _{ij}=\min \left( \lambda _iT, \lfloor \frac{k}{M}\rfloor \right) , \sigma _{ji}=\min \left( \lambda _jT, \lfloor \frac{L_{seq}-k}{M}\rfloor \right)\), where \(0 \le k < L_{seq}\). The collision rate of TCP sequence numbers can then be obtained by \(Pr_{seq}\left( i,j\right) =Pr_{seq}^{ij}+Pr_{seq}^{ji}\).
For sessions \(s_i,s_j\left( 1 \le i \ne j \le N\right)\), where their initial IP ID numbers are \(e_i, e_j\left( 0 \le e_i< e_j < L_{id}\right)\), we can calculate the differences of initial values \(d_{ij}=e_j-e_i\) and \(d_{ji}=L_{id}-e_j+e_i\) (in case the ID wraps around). Similarly, we can derive Eqs. (3) and (4) as the two sub-expressions of the collision rate of IP ID numbers. Hence, the collision rate of IP ID numbers can be obtained by \(Pr_{id}\left( i,j\right) =Pr_{id}^{ij}+Pr_{id}^{ji}\). Notably, the increment size of the IP ID number is 1.
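The two collision conditions can also be checked numerically. The following is a rough Monte Carlo sketch rather than an implementation of Eqs. (1)–(4): it assumes uniformly random initial values, a fixed increment M per segment, and Poisson segment counts within T that are sampled from exponential inter-arrival times. All function names are hypothetical.

```python
import random


def field_values(initial, count, step, space):
    """Values a linkage field takes after `count` increments of size `step`."""
    return {(initial + k * step) % space for k in range(count + 1)}


def collides(e_i, e_j, n_i, n_j, step, space):
    """True if two sessions share at least one linkage-field value."""
    a = field_values(e_i, n_i, step, space)
    b = field_values(e_j, n_j, step, space)
    return not a.isdisjoint(b)


def poisson_count(rng, rate, window):
    """Number of arrivals of a homogeneous Poisson process within `window`."""
    elapsed, count = 0.0, 0
    while True:
        elapsed += rng.expovariate(rate)
        if elapsed > window:
            return count
        count += 1


def estimate_collision_rate(rate_i, rate_j, window, step, space,
                            trials=2000, seed=0):
    """Fraction of random session pairs whose field values collide."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        e_i, e_j = rng.randrange(space), rng.randrange(space)
        n_i = poisson_count(rng, rate_i, window)
        n_j = poisson_count(rng, rate_j, window)
        hits += collides(e_i, e_j, n_i, n_j, step, space)
    return hits / trials


# IP ID numbers: space 2^16, increment 1, about 100 segments per window.
id_rate = estimate_collision_rate(100.0, 100.0, 1.0, step=1, space=2 ** 16)
```

With an increment of 1, a collision occurs whenever one session's ID range reaches the other's initial value, including across the wrap-around; with a larger M, the initial values must additionally differ by a multiple of M.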
If the attack goal is to reassemble a single session \(s_{i}\), the one-time success rate \(Pr_{atk}^i\) can be calculated by Eq. (5). If, instead, the attack goal is to reassemble all the sessions, the one-time success rate \(Pr_{atk}\) can be calculated by Eq. (6). Notably, we implicitly assume that each segment is (partially) obfuscated. Therefore, this model presents a lower bound on the one-time success rate. In other words, for the end hopping methods (with a lower hopping frequency) that do not obfuscate each segment, the one-time success rate is higher.
Defensive model
System design
As shown in Fig. 3, the system is based on the SDN architecture. The SDN controller contains three core modules: the flow dispatcher module, the key management module and the segment injection module. The flow dispatcher module is responsible for installing forwarding rules, which ensure correct forwarding of obfuscated segments. For instance, if the obfuscated IP address is 192.168.1.x, the corresponding forwarding rule would match the CIDR address 192.168.1.0/24. The key management module handles the dispatching and updating of hopping keys for gateways. The segment injection module periodically retrieves flow entry information from edge switches, and injects fake segments into each storage domain. Gateways receive configurations, such as hopping keys, from the controller. The gateways obfuscate and recover the segments as they pass through. Specifically, when a segment enters the storage network from the access network, the external gateway obfuscates it. The obfuscated segment is then recovered when it passes through the corresponding internal gateway. Notably, gateways do not need to identify and filter injected segments, because the fake segments are discarded by storage servers owing to malformed packet headers, such as incorrect checksum fields.
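The forwarding-rule example can be made concrete with Python's standard `ipaddress` module. This is only an illustration of the prefix match; the helper name `matches_rule` is hypothetical and does not correspond to any controller API.

```python
import ipaddress


def matches_rule(address: str, cidr: str) -> bool:
    """True if an (obfuscated) address falls inside the rule's CIDR prefix."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


# Any host part produced by hopping inside the subnet still matches the rule.
inside = matches_rule("192.168.1.77", "192.168.1.0/24")
outside = matches_rule("192.168.2.77", "192.168.1.0/24")
```

Because the rule matches the whole /24 prefix, the host part can hop per segment without any flow table update.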
Linkage-field based self-synchronizing end hopping method
Firstly, we present the linkage-field based self-synchronizing end hopping method (LSEH), as shown in Algorithm 2. The innovations of LSEH lie in the following two aspects: (1) Similar to the system time used in TSEH, the values of linkage fields increase in a space modulo L as a session progresses. However, unlike system time, the increment of the sequence number is uncertain in practice, which introduces additional uncertainty to hopping results. (2) As the unique identification of a segment, the linkage fields can be adopted for synchronization without third-party assistance. This feature can enhance the efficiency of PRF calculation and the robustness of the hopping method. Notably, in line 2 of Algorithm 2, the gateways obfuscate segments with different hopping keys, which are determined by the domain id. In lines 4–11, the end information and linkage fields are obfuscated by XOR operations. The mutation space \(\Omega\) is determined by the range of end information and linkage fields, specifically \(|\Omega _{id}|=|\Omega _{port}|=2^{16}, |\Omega _{seq}|=2^{32}\). As the IP address (e.g., 192.168.1.1) undergoes mutation within its subnet (e.g., 192.168.1.x), \(|\Omega _{ip}|=2^8\) correspondingly. Since the procedure of recovering obfuscated segments is the reverse of obfuscation, detailed illustrations are omitted.
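The idea of XOR-based, self-synchronizing obfuscation can be sketched as follows. This is not Algorithm 2, which is not reproduced in this text. The construction is an assumption made for illustration: HMAC-SHA256 stands in for the PRF, the sequence number and IP ID mask each other in two invertible steps, and the obfuscated pair then serves as the nonce for the IP host part and port. The key table, field names and function names are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical per-domain hopping keys (dispatched by the controller in the paper).
HOPPING_KEYS = {1: b"domain-1-demo-key", 2: b"domain-2-demo-key"}


def prf(key: bytes, label: bytes, value: int, bits: int) -> int:
    """HMAC-SHA256 truncated to `bits` bits (assumed PRF choice)."""
    message = label + value.to_bytes(8, "big")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") & ((1 << bits) - 1)


def obfuscate(domain_id: int, seg: dict) -> dict:
    """XOR-mask linkage fields first, then end information."""
    key = HOPPING_KEYS[domain_id]
    seq = seg["seq"] ^ prf(key, b"seq", seg["ip_id"], 32)   # |Omega_seq| = 2^32
    ip_id = seg["ip_id"] ^ prf(key, b"id", seq, 16)         # |Omega_id| = 2^16
    nonce = (seq << 16) | ip_id  # visible on the wire, so the peer can resync
    return {
        "seq": seq,
        "ip_id": ip_id,
        "host": seg["host"] ^ prf(key, b"host", nonce, 8),   # |Omega_ip| = 2^8
        "port": seg["port"] ^ prf(key, b"port", nonce, 16),  # |Omega_port| = 2^16
    }


def recover(domain_id: int, obf: dict) -> dict:
    """Undo the masks in reverse order using only the packet and the key."""
    key = HOPPING_KEYS[domain_id]
    nonce = (obf["seq"] << 16) | obf["ip_id"]
    ip_id = obf["ip_id"] ^ prf(key, b"id", obf["seq"], 16)
    seq = obf["seq"] ^ prf(key, b"seq", ip_id, 32)
    return {
        "seq": seq,
        "ip_id": ip_id,
        "host": obf["host"] ^ prf(key, b"host", nonce, 8),
        "port": obf["port"] ^ prf(key, b"port", nonce, 16),
    }


segment = {"seq": 123456789, "ip_id": 4242, "host": 1, "port": 3260}
hidden = obfuscate(1, segment)
restored = recover(1, hidden)
```

Every mask depends only on the hopping key and on fields carried in the packet itself, so the receiving gateway needs neither a clock nor a third party to recover the segment.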
Subsequently, we establish the defensive model to analyze the one-time success rate of the attack when LSEH is deployed. In the absence of access to the hopping keys, attackers encounter difficulties in correctly associating the segments with their corresponding sessions, as shown in Fig. 4. However, as depicted in Fig. 5, attackers may attempt to reassemble sessions by brute-force, trying to assign each segment to a session that it might belong to. In other words, the attacker is initially unaware of which obfuscated segment belongs to a specific session. However, given knowledge of the number of sessions, they can employ permutation and combination techniques to iteratively associate each segment with each session until successful session reassembly is achieved. Assuming that attackers can identify the start flag of a session and the number of sessions N within a storage domain, the probability of correctly assigning an intermediate segment (excluding the start or end segments) to its corresponding session is \(\frac{1}{N}\). Meanwhile, the probabilities of correctly assigning an end segment to a session and to all the sessions are \(\frac{1}{N}\) and \(\frac{1}{A_{N}^{N}}\), respectively. Therefore, if the attack goal is to reassemble a single session \(s_{i}\), the one-time success rate \(Pr_{atk'}^{i}\) can be calculated by Eq. (7), where \(t_{i}\) represents the duration of session \(s_{i}\). If the attack goal is to reassemble all the sessions, the one-time success rate \(Pr_{atk'}\) can be calculated by Eq. (8). In “Appendix”, we have proved that \(Pr_{atk'} \ll Pr_{atk}\) and \(Pr_{atk'}^i \ll Pr_{atk}^i\) when N is within a practical range, such as \(2 \le N \le 11048\) as obtained from the datasets. This conclusion reveals that the one-time success rate of the attack is significantly lower than before the deployment of LSEH.
However, by examining Eqs. (7) and (8), it is evident that small values of N and G can lead to a high success rate. Denote Q as the crack capability, which represents the number of brute-force trials within a given time. It can be inferred that attacks can be successful if \(Q \cdot Pr_{atk'} \ge 1\). Although there is a lack of studies determining the value of Q, it is easy to identify a corner case where attacks can definitely succeed, regardless of the value of Q. This occurs when \(N=1\). In this event, all the end hopping methods become ineffective because \(Pr_{atk'} =1\).
Fake segment injection method
The division of server addresses within univ1_pt1 (the first part of the data center datasets UNI1 (Benson 2024)) into different C-class domains (e.g., x.x.x.0/24) is shown in Fig. 6. It reveals that the number of sessions (and segments) within these domains follows a heavy-tailed distribution, with the majority of domains containing only a small number of sessions. To mitigate brute-force reassembly of a small number of sessions, we propose the fake segment injection method (FSI), which utilizes the SDN controller to inject fake segments, thereby increasing the complexity of session reassembly. Since a fake segment is constructed with fake end information and linkage fields, which are randomly selected from their mutation spaces, attackers are unable to distinguish between fake segments and normal segments, as depicted in Fig. 7. However, considering the limited capacity of the control channel, it is crucial to devise an efficient strategy for injecting a restricted number of segments while achieving higher security.
Firstly, we analyze the one-time success rate of the brute-force attack when both LSEH and FSI are deployed. According to the previous analysis, owing to the presence of injected fake segments, attackers need to correctly assign each intermediate segment to one of \(N+1\) sessions (including a fake session). Therefore, the probability of correctly assigning an intermediate segment is \(\frac{1}{N+1}\). Denote the number of injected fake segments as c. If the attack goal is to reassemble \(s_{i}\), the one-time success rate \(Pr_{atk^*}^{i}\) can be calculated by Eq. (9), where \(\lambda ^{*}\) represents the injection rate. If the attack goal is to reassemble all the sessions, the one-time success rate \(Pr_{atk^*}\) can be calculated by Eq. (10). Compared with Eqs. (7) and (8), the one-time success rate of the attack remains insignificant even when \(N=1\), and it declines exponentially, indicating that security improves exponentially.
Subsequently, we establish a model to determine the injection strategy. Define the security gain J as the logarithm of the ratio between the one-time success rate without FSI and that with FSI, as presented in Eq. (11). In this equation, \(c\ln \left( N+1\right)\) represents the external gain, which partially depends on the value of c, while \(\left( G-2N\right) \ln \frac{N+1}{N}\) represents the internal gain, which depends only on the values of N and G. This equation reveals that a larger value of c, N or G can lead to a larger security gain. Specifically, given the values of \(G_{i}\) and \(N_{i}\), we can determine the number of injected segments for each domain \(\left\{ c_i\right\} _{i=1}^D\) by solving the model about \(J_{i}\).
Finally, we solve the model about \(J_{i}\) with regard to two goals: (1) According to the barrel principle, the most vulnerable domains should achieve the largest security gain, as defined in Eq. (12). (2) The total security gain should be maximized, as defined in Eq. (13). Considering the conditions \(\left\{ c_i\right\} _{i=1}^D \ge 1\) and \(\sum _{i=1}^D c_i \le C\), which represent the limited capacity of the control channel, we can solve these objective functions. Notably, they cannot be optimized simultaneously owing to their mutual restriction. Since an exact solution is not necessary, we adopt the \(\epsilon\)-constraint method to transform them into a single-objective optimization problem, as shown in formula (14), where \(\epsilon =\alpha \max \min \limits _{1 \le i \le D}J_i\left( 0.8 \le \alpha < 1\right)\) and \(\alpha\) is set empirically. Notably, if the security gain for the crack capability \(J_{Q}\) is known, we have \(\epsilon =J_{Q}\). In summary, by solving formula (14), we can obtain \(\left\{ c_i\right\} _{i=1}^D\) and determine the injection procedure in Algorithm 3.
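The two terms of the security gain named above can be evaluated directly. The sketch below assumes that Eq. (11) is the sum of the external and internal terms exactly as they are quoted in the text; the function names are hypothetical.

```python
import math


def external_gain(c: int, n: int) -> float:
    """c * ln(N + 1): gain contributed by c injected fake segments."""
    return c * math.log(n + 1)


def internal_gain(n: int, g: int) -> float:
    """(G - 2N) * ln((N + 1) / N): gain from the extra fake session alone."""
    return (g - 2 * n) * math.log((n + 1) / n)


def security_gain(c: int, n: int, g: int) -> float:
    """Assumed form of J: external gain plus internal gain."""
    return external_gain(c, n) + internal_gain(n, g)


# A single-session domain (N = 1) with G = 1000 segments.
without_injection = security_gain(c=0, n=1, g=1000)
with_injection = security_gain(c=500, n=1, g=1000)
```

Since J is a logarithm of a probability ratio, each additional injected segment multiplies the attacker's expected effort by \(N+1\) under this reading of the model.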
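A possible way to carry out the \(\epsilon\)-constraint step is sketched below. It is not Algorithm 3 or formula (14) themselves. It assumes that \(J_i\) is the sum of the internal gain and \(c_i\ln (N_i+1)\), relaxes \(c_i\) to real values, and assumes \(C \ge D\). Under those assumptions the max-min level can be found by bisection, and the remaining budget goes to the domain with the largest gain per injected segment. All names are hypothetical.

```python
import math


def allocate(sessions, segments, budget, alpha=0.9, iterations=200):
    """Return (c, J, eps) for a relaxed epsilon-constraint allocation."""
    weight = [math.log(n + 1) for n in sessions]  # gain per injected segment
    base = [(g - 2 * n) * math.log((n + 1) / n)   # internal gain
            for n, g in zip(sessions, segments)]

    def needed(level):
        # Fewest segments per domain so that J_i >= level, keeping c_i >= 1.
        return [max(1.0, (level - b) / w) for b, w in zip(base, weight)]

    # Step 1: bisection for the best achievable worst-case gain.
    low = min(b + w for b, w in zip(base, weight))  # reached with c_i = 1
    high = max(base) + budget * max(weight)         # at least as costly as the budget
    for _ in range(iterations):
        mid = (low + high) / 2
        if sum(needed(mid)) <= budget:
            low = mid
        else:
            high = mid
    eps = alpha * low

    # Step 2: satisfy J_i >= eps, then spend the rest where it adds the most.
    c = needed(eps)
    best = max(range(len(c)), key=lambda i: weight[i])
    c[best] += budget - sum(c)
    gains = [b + ci * w for b, ci, w in zip(base, c, weight)]
    return c, gains, eps


# Toy input: three domains with G_i = 1000 * N_i and a budget of 300 segments.
N = [1000, 10, 1]
G = [1000 * n for n in N]
c, J, eps = allocate(N, G, budget=300, alpha=0.9)
```

A lower \(\alpha\) relaxes the worst-case constraint and frees more of the budget for the total gain, which reflects the mutual restriction between the two objectives. Rounding \(c_i\) to integers is left out of this sketch.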
Evaluation and analysis
In this section, we conduct experiments to evaluate the effectiveness of the adversary model and defensive models, as well as the performance overhead of the proposed methods. The experimental configuration is listed in Table 3.
Effectiveness evaluation
Evaluation of adversary model
The adversary model presents a method that relies on linkage fields to reassemble sessions. To verify its effectiveness in session reassembly, we evaluate the reassembly rate of the sessions that are extracted from each part of UNI1 (univ1_pt1–univ1_pt20). Specifically, we utilize Scapy (Biondi 2024) to implement Algorithm 1 and reassemble sessions based solely on TCP sequence number. Subsequently, we record the reassembly rate of both sessions and segments. The results depicted in Fig. 8 reveal that the majority of the sessions can be reassembled successfully. Therefore, the effectiveness of the adversary model in session reassembly is verified. Notably, the reassembly rate of segments is relatively low owing to the exclusion of segments from incomplete sessions.
To verify the effectiveness of the adversary model in compromising the end hopping methods that do not obfuscate certain linkage fields, we evaluate the collision rate of the linkage fields, which serves as an indicator of the one-time success rate, as mentioned in Sect. 3. Since conducting numerous sessions for session reassembly in a real-world environment is impractical, we utilize SimSharp (Beham 2024) (the C# version of SimPy (Connell et al. 2018; Xiong et al. 2019), a discrete event simulation framework) to simulate session transmission and segment obfuscation. Afterward, we record the collision rate of linkage fields, such as TCP sequence numbers and IP ID numbers. The results for RCEH (hopping per 60 s (Skowyra et al. 2016)), TSEH (hopping per 5 s (Luo et al. 2017)) and PSEH (hopping per segment (Luo et al. 2017)) are shown in Figs. 9 and 10. From the results, we can observe that: (1) In accordance with Eqs. (5) and (6), a lower number of sessions and segments leads to a reduced collision rate and a higher one-time success rate. (2) The collision rate exhibits a positive correlation with the hopping frequency. In terms of security capability, we can establish the following relationship: RCEH < TSEH < PSEH. (3) Even for PSEH, which is considered more secure owing to segment-level hopping, the adversary model remains highly effective when the number of sessions is small. In summary, the results and observations provide evidence of the adversary model’s effectiveness and highlight the deficiencies of end hopping methods that do not obfuscate crucial linkage fields.
Evaluation of LSEH
The effectiveness of an end hopping method is primarily determined by the level of obfuscation, which can be evaluated by statistical features (Pfitzmann and Hansen 2005). Therefore, we employ the Shannon entropy to evaluate the dispersion level of the obfuscated end information, such as IP and Port. Meanwhile, we propose to use the permutation entropy (Bandt and Pompe 2002) to evaluate the regularity of obfuscated linkage fields, such as TCP sequence number and IP ID number. The term “regularity” refers to the ordering of the linkage fields. Consider the two sequences \(S_{1}=[100, 101, 5, 6, 100]\) and \(S_{2}=[9, 5, 27, 97, 52]\), which are generated by methods that hop every two items and every item, respectively. \(S_{1}\) exhibits more regularity and predictability compared to \(S_{2}\). Denote \(H_{s}\left( f\right)\) and \(H_{p}\left( f\right)\) as the Shannon entropy and permutation entropy of the sequence of obfuscated fields, respectively. It can be observed that \(H_{s}\left( S_{1}\right) < H_{s}\left( S_{2}\right)\) and \(H_{p}\left( S_{1}\right) < H_{p}\left( S_{2}\right)\). Therefore, a larger value of \(H_{s}\left( f\right)\) or \(H_{p}\left( f\right)\) corresponds to a higher level of obfuscation.
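The comparison of \(S_{1}\) and \(S_{2}\) can be reproduced with a short sketch of both entropies. The embedding dimension used here is \(m=2\), which is an assumption made for this toy example only: with five items, \(m=3\) leaves just three windows per sequence, too few to separate the two. The function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Entropy (in bits) of the empirical distribution of the values."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def permutation_entropy(values, m):
    """Entropy (in bits) of the ordinal patterns of length m."""
    patterns = Counter(
        # The ordinal pattern is the index order that sorts the window.
        tuple(sorted(range(m), key=lambda k: values[i + k]))
        for i in range(len(values) - m + 1)
    )
    total = sum(patterns.values())
    return -sum(c / total * math.log2(c / total) for c in patterns.values())


S1 = [100, 101, 5, 6, 100]  # hops every two items
S2 = [9, 5, 27, 97, 52]     # hops every item

hs1, hs2 = shannon_entropy(S1), shannon_entropy(S2)
hp1, hp2 = permutation_entropy(S1, 2), permutation_entropy(S2, 2)
```

In \(S_{1}\) three of the four adjacent pairs are increasing, whereas \(S_{2}\) splits evenly between increasing and decreasing pairs, so its ordinal patterns are less predictable.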
Subsequently, we apply each method listed in Table 1 to the file transfer sessions of varying size (e.g., 10 MB, 100 MB and 1 GB), and calculate the normalized entropy. For Shannon entropy, the normalized result is calculated by \(H_{s}^{*}\left( f\right) =\frac{H_{s}{\left( f\right) }}{\log _{2}|\Omega _{f}|}\). For permutation entropy, the normalized result is calculated by \(H_{p}^{*}\left( f\right) =\frac{H_{p}{\left( f\right) }}{\log _{2} A_{m}^{m}}\), where m is the embedding dimension and set to 3 empirically (Bandt and Pompe 2002). Notably, conducting experiments on a single session can better demonstrate the capability of each method, as the results solely depend on the implementation of each method. The experimental results presented in Tables 4, 5 and 6 reveal that: (1) RCEH and TSEH achieve lower Shannon entropy owing to their lower hopping frequency. While they perform better for large files (e.g., 1 GB), they may still pose potential security risks for small files. (2) All PSEH methods achieve similar high entropy, even though they are implemented in different ways. (3) LSEH, in particular, is an implementation of PSEH that covers all essential linkage fields. It exhibits higher security compared to the related methods in terms of hopping frequency and consideration of linkage fields. Therefore, the effectiveness of LSEH is verified. Notably, owing to out-of-order packets resulting from retransmission, the permutation entropy of linkage fields can exceed 0 even without the deployment of hopping methods, as indicated by the “None” method in these tables.
Evaluation of FSI
Firstly, we evaluate the feasibility of FSI in terms of ensuring the undetectability of the injected fake segments. To achieve this, we inject varying proportions of fake segments into the obfuscated file transfer sessions, and observe the change in the statistical features of the obfuscated fields. The experimental results are presented in Table 7. It can be observed that, as the injection ratio increases, the Shannon entropy of the IP and Port fields exhibits a slight increase, while the permutation entropy of the linkage fields remains nearly unchanged. This observation indicates that the level of obfuscation does not diminish after injection. Therefore, it would be challenging for attackers to distinguish the fake segments from the normal segments, and the feasibility of FSI is verified.
Subsequently, we verify the effectiveness of FSI by evaluating the security gain J. In order to calculate \(J_{i}\), we need to obtain the essential parameters, such as the number of domains D, and the number of sessions \(N_{i}\) and segments \(G_{i}\) within each domain. To achieve this, we collect all 11048 unidirectional sessions from univ_pt1. By extracting the two most representative B-class domains (41.177.0.0/16 and 244.3.0.0/16), which account for 83.8% of the sessions and 80.56% of the segments, we obtain 18 C-class domains therein (\(D=18\)) and the number of sessions within them (\(\left\{ N_{i}\right\} _{i=1}^D=[3775, 3410, 563, 551, 457, 300, 54, 51, 42, 19, 14, 13, 3, 2, 1, 1, 1, 1]\)). For simplicity, we set \(G_{i}=1000N_{i}\), as the majority of the sessions within the dataset contain fewer than 1000 segments. Meanwhile, we set the injection rate to 100 segments per second, which is significantly below the capability of the control channel. As the sessions within the dataset last for 345 s, the total number of injected segments is \(C=34500\). Finally, we utilize cvxpy (Diamond 2024) to solve formula (14).
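The parameter values above can be reproduced with a few lines of arithmetic. The sketch below only restates the numbers given in this paragraph; the variable names are illustrative, and it does not attempt to reproduce formula (14) or the cvxpy model.

```python
SESSIONS_TOTAL = 11048  # unidirectional sessions in univ_pt1

# Sessions per C-class domain, as listed in the text.
N = [3775, 3410, 563, 551, 457, 300, 54, 51, 42, 19, 14, 13, 3, 2, 1, 1, 1, 1]

D = len(N)                  # number of domains
G = [1000 * n for n in N]   # simplifying assumption from the text: G_i = 1000 * N_i

INJECTION_RATE = 100        # fake segments per second
DURATION_S = 345            # duration of the sessions in seconds
C = INJECTION_RATE * DURATION_S  # total injection budget

share = sum(N) / SESSIONS_TOTAL  # fraction of all sessions covered by the 18 domains

print(D, sum(N), f"{share:.1%}", C)  # 18 9258 83.8% 34500
```

The 18 domains hold 9258 of the 11048 sessions, which matches the stated 83.8%.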
The security gain of each domain and the total security gain under varying injection modes are shown in Figs. 11 and 12, respectively. From Fig. 11, it can be observed that different injection modes yield different security gains for the most vulnerable domain. Compared with the None and average injection modes, the optimal injection mode yields a higher security gain for the most vulnerable domain, for example when \(\alpha\) is set to 0.8 or 0.9. From Fig. 12, it can be observed that the total security gain across domains also differs between injection modes. Compared with the None and average injection modes, the optimal injection mode achieves a higher total security gain. Additionally, (1) the initial security gain of each domain is low before injection, but it increases significantly after injection. (2) The security gain of the most vulnerable domain and the total security gain exhibit an inverse relationship, which indicates a mutual restriction between the objective functions defined in Eqs. (12) and (13).
The changes in security gain are shown in Figs. 13 and 14, respectively. It can be observed that: (1) As the number of segments within each domain increases, both the overall internal security gain and the security gain of the most vulnerable domain grow linearly. Similarly, as the injection rate increases, both the overall external security gain and the security gain of the most vulnerable domain increase linearly. Moreover, the external security gain resulting from injection increases more significantly than the internal security gain. (2) According to the definition of the security gain, the one-time success rate declines exponentially as the security gain increases linearly. Therefore, the effectiveness of FSI is verified.
Performance evaluation
To evaluate the performance overhead of LSEH, we construct a physical SDN-based storage system, as shown in Fig. 15. In this system, we utilize Linux Netfilter to implement the soft gateways (deployed within hosts) with the LSEH, TSEH and PSEH methods, respectively. The PRF calculation in these methods employs the HMAC-MD5 algorithm. Subsequently, we utilize the performance testing tool FIO (Axboe 2024; Gudu et al. 2014) for evaluation. Specifically, we employ FIO to initiate 10 processes simultaneously, each writing a 10 MB file without optimization. In particular, to verify the deficiency of TSEH, we intentionally set the divergence of system time between the client and the server to 1 s, and configure the synchronizing window to 10 s, which indicates synchronization if their system time modulo 10 is equal. Under varying file block sizes, the measured throughput and task delay are presented in Figs. 16 and 17. Notably, initiating a large number of sessions in the physical testbed does not facilitate accurate evaluation of the performance overhead. This is primarily owing to two factors: system jitter caused by multi-thread scheduling and packet dropping resulting from NIC (network interface card) congestion. Both factors can reduce the reliability of the performance overhead evaluation.
In Figs. 16 and 17, TSEH_I and TSEH_II represent the scenarios where the timing enters and leaves the synchronizing window, respectively. It can be observed that: (1) For file block sizes of 4 KB or 8 KB, TSEH shows lower performance owing to the lack of synchronization between the client and the server (e.g., a packet sent at 59 s and received at 0 s). Unsynchronized segments are dropped, which leads to retransmission. (2) For file block sizes of 32 KB or 128 KB, when the task is completed within the synchronizing window, TSEH_I outperforms the others owing to its efficient PRF calculation with the timestamp, as shown in Fig. 17c, d. Conversely, TSEH_II exhibits the worst performance owing to the unsynchronized segments. Therefore, the deficiency of TSEH is demonstrated. (3) LSEH and PSEH exhibit higher stability. Furthermore, LSEH outperforms PSEH, as PSEH takes more time for PRF calculation with a payload of more than 1000 bytes. Compared to the scenario where end hopping methods are not deployed, the performance overhead introduced by LSEH is at most 5.4% when the file block size is set to 32 KB. Therefore, the performance overhead of LSEH is acceptable.
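The TSEH desynchronization described above can be illustrated with a small sketch. It assumes, as one reading of the window rule, that two hosts are in sync when their clocks fall into the same 10 s slot (integer division by the window length), and that the hopped port is derived from HMAC-MD5 over the slot index. The key, the port mapping and all function names are hypothetical and are not taken from the authors' implementation.

```python
import hashlib
import hmac

WINDOW_S = 10  # synchronizing window from the experiment
SKEW_S = 1     # clock divergence between client and server


def slot(t_seconds, window=WINDOW_S):
    """Index of the time slot that a local clock reading falls into (assumed rule)."""
    return t_seconds // window


def hopped_port(key, t_seconds):
    """Hypothetical mapping: HMAC-MD5 over the slot index, folded into 1024..65535."""
    digest = hmac.new(key, slot(t_seconds).to_bytes(8, "big"), hashlib.md5).digest()
    return 1024 + int.from_bytes(digest[:2], "big") % (65536 - 1024)


def in_sync(client_t, skew=SKEW_S):
    """True if client and server (whose clock is ahead by skew) share a slot."""
    return slot(client_t) == slot(client_t + skew)


key = b"shared-secret"  # placeholder key for illustration only
desync = [t for t in range(60) if not in_sync(t)]
print(desync)  # [9, 19, 29, 39, 49, 59]
print(hopped_port(key, 12) == hopped_port(key, 13))  # True: same slot, same port
```

Under these assumptions, a 1 s divergence leaves the two ends in different slots for one second out of every ten, and segments sent in that second would be checked against a different slot on the receiver, which is consistent with the drops and retransmissions reported for TSEH.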
The overhead of FSI mainly lies in the consumption of CPU resources within the SDN controller. Therefore, to evaluate the performance overhead, we measure the CPU usage of the SDN controller under varying numbers of storage domains. Specifically, we utilize Ryu and Mininet to construct a simulation testbed, which consists of a varying number of switches. In this testbed, each switch is connected to a host that represents a storage domain, and the switches are connected in a linear topology for stress testing. Subsequently, we set the injection rate to 100 segments per second, and record the CPU usage as the number of storage domains increases. The results, shown in Fig. 18, indicate that the CPU usage increases linearly. Even when the number of storage domains reaches 100, the CPU usage does not exceed 35%. This demonstrates that the FSI mechanism is lightweight.
Conclusion
In the context of SDN-based storage environments, we introduced an adversary model to uncover the deficiencies of existing end hopping methods. To address these deficiencies, we proposed a novel linkage-field-based self-synchronizing end hopping method. Additionally, considering that brute-force reassembly of a small number of sessions can render end hopping methods ineffective, we proposed a fake segment injection method. Finally, we conducted extensive experiments to evaluate the effectiveness of these methods and their performance overhead. The experimental results demonstrate that the proposed methods can effectively defend against eavesdropping attacks with acceptable performance overhead. While the proposed methods primarily focus on storage systems and the IPv4/TCP protocol, their underlying principles can also be extended to other application scenarios, such as legacy networks (Govil et al. 2020), MANET (Wang et al. 2021) and IoT (Liu et al. 2021).
However, the soft gateways implemented with Linux Netfilter have certain limitations. Firstly, the implementation lacks transparency, as the soft gateways are deployed within hosts for simplicity. Secondly, the performance of the gateways still needs improvement. To address these issues in future work, we are considering the implementation of hardware gateways based on P4 or FPGA technologies, which are anticipated to provide superior transparency and performance.
Availability of data and materials
Our data and codes are provided at https://github.com/fragileeye/Proactive-defense.
References
Akhunzada A, Ahmed E, Gani A, Khan MK, Imran M, Guizani S (2015) Securing software defined networks: taxonomy, requirements, and open issues. IEEE Commun Mag 53(4):36–44
Alneyadi S, Sithirasenan E, Muthukkumarasamy V (2016) A survey on data leakage prevention systems. J Netw Comput Appl 62:137–152
Aseeri A, Netjinda N, Hewett R (2017) Alleviating eavesdropping attacks in software-defined networking data plane. In: Proceedings of the 12th annual conference on cyber and information security research, pp 1–8
Axboe J (2024) FIO. https://github.com/axboe/fio/
Bandt C, Pompe B (2002) Permutation entropy: a natural complexity measure for time series. Phys Rev Lett 88(17):174102
Beham A (2024) SimSharp. https://github.com/heal-research/SimSharp/
Benson T (2024) UNI1. https://pages.cs.wisc.edu/~tbenson/IMC10_Data.html
Biondi P (2024) Scapy community: Scapy. https://scapy.net/
Chang S-Y, Park Y, Babu BBA (2018) Fast IP hopping randomization to secure hop-by-hop access in SDN. IEEE Trans Netw Serv Manag 16(1):308–320
Chen FC, He WZ, Cheng GZ et al (2020) Design of key technologies for intranet dynamic gateway based on DPDK. J Commun 41(6):139–151
Connell W, Menasce DA, Albanese M (2018) Performance modeling of moving target defenses with reconfiguration limits. IEEE Trans Depend Secure Comput 18(1):205–219
Cunha VA, Corujo D, Barraca JP, Aguiar RL (2021) TOTP moving target defense for sensitive network services. Pervasive Mob Comput 74:101412
Datta T, Feamster N, Rexford J, Wang L (2019) Spine: surveillance protection in the network elements. In: 9th USENIX workshop on free and open communications on the internet (FOCI 19)
DeCusatis C, Lynch RM, Kluge W, Houston J, Wojciak PA, Guendert S (2019) Impact of cyberattacks on precision time protocol. IEEE Trans Instrum Meas 69(5):2172–2181
Diamond S (2024) CVXPY community: CVXPY. https://www.cvxpy.org/
Duan Q, Al-Shaer E, Jafarian H (2013) Efficient random route mutation considering flow and network constraints. In: 2013 IEEE conference on communications and network security (CNS). IEEE, pp 260–268
Fan Y, Wu G, Li K-C, Castiglione A (2022) Robust end hopping for secure satellite communication in moving target defense. IEEE Internet Things J 9(18):16908–16916
Govil Y, Wang L, Rexford J (2020) MIMIQ: masking IPs with migration in QUIC. In: 10th USENIX workshop on free and open communications on the internet (FOCI 20)
Gregorczyk M, Żórawski P, Nowakowski P, Cabaj K, Mazurczyk W (2020) Sniffing detection based on network traffic probing and machine learning. IEEE Access 8:149255–149269
Gudu D, Hardt M, Streit A (2014) Evaluating the performance and scalability of the Ceph distributed storage system. In: 2014 IEEE international conference on big data (Big Data). IEEE, pp 177–182
Guillen L, Izumi S, Abe T, Suganuma T, Muraoka H (2018) SDN-based hybrid server and link load balancing in multipath distributed storage systems. In: NOMS 2018-2018 IEEE/IFIP network operations and management symposium. IEEE, pp 1–6
Handley M, Raiciu C, Agache A, Voinescu A, Moore AW, Antichi G, Wójcik M (2017) Re-architecting datacenter networks and stacks for low latency and high performance. In: Proceedings of the conference of the ACM special interest group on data communication, pp 29–42
Jafarian JH, Al-Shaer E, Duan Q (2012) Openflow random host mutation: transparent moving target defense using software defined networking. In: Proceedings of the first workshop on hot topics in software defined networks, pp 127–132
Jafarian JHH, Al-Shaer E, Duan Q (2014) Spatio-temporal address mutation for proactive cyber agility against sophisticated attackers. In: Proceedings of the first ACM workshop on moving target defense, pp 69–78
Jafarian JH, Niakanlahiji A (2023) MultiRHM: defeating multi-staged enterprise intrusion attacks through multi-dimensional and multi-parameter host identity anonymization. Comput Secur 124:102958
Khan AN, Qureshi K, Khan S (2012) An intelligent approach of sniffer detection. Int Arab J Inf Technol 9(1):9–15
Kher V, Kim Y (2005) Securing distributed storage: challenges, techniques, and systems. In: Proceedings of the 2005 ACM workshop on storage security and survivability, pp 9–25
Liu G, Quan W, Cheng N, Gao D, Lu N, Zhang H, Shen X (2021) Softwarized IoT network immunity against eavesdropping with programmable data planes. IEEE Internet Things J 8(8):6578–6590
Luo Y-B, Wang B-S, Wang X-F, Zhang B-F, Hu W (2017) RPAH: a moving target network defense mechanism naturally resists reconnaissances and attacks. IEICE Trans Inf Syst 100(3):496–510
Luo Y-B, Wang B-S, Wang X-F, Zhang B-f (2017) A keyed-hashing based self-synchronization mechanism for port address hopping communication. Front Inf Technol Electron Eng 18(5):719–728
MacFarland DC, Shue CA (2015) The SDN shuffle: creating a moving-target defense using host-based software-defined networking. In: Proceedings of the second ACM workshop on moving target defense, pp 37–41
Ma T, Xu C, Yang S, Huang Y, An Q, Kuang X, Grieco LA (2023) A mutation-enabled proactive defense against service-oriented man-in-the-middle attack in kubernetes. IEEE Trans Comput
Moghaddam HM, Mosenia A (2019) Anonymizing masses: practical light-weight anonymity at the network level. arXiv preprint arXiv:1911.09642
Pfitzmann A, Hansen M (2005) Anonymity, unlinkability, unobservability, pseudonymity, and identity management—a consolidated proposal for terminology
Saha S, Morrison C, Sprintson A (2017) StorageFlow: SDN-enabled efficient data regeneration for distributed storage systems. In: 2017 IEEE conference on computer communications workshops (INFOCOM WKSHPS). IEEE, pp 187–192
Skowyra R, Bauer K, Dedhia V, Okhravi H (2016) Have no PHEAR: networks without identifiers. In: Proceedings of the 2016 ACM workshop on moving target defense, pp 3–14
Trabelsi Z, Rahmani H, Kaouech K, Frikha M (2004) Malicious sniffing systems detection platform. In: Proceedings of the 2004 international symposium on applications and the internet. IEEE, pp 201–207
Wang P, Zhou M, Ding Z (2021) A two-layer IP hopping-based moving target defense approach to enhancing the security of mobile ad-hoc networks. Sensors 21(7):2355
Wang Y, Yi J, Guo J, Qiao Y, Qi M, Chen Q (2018) A semistructured random identifier protocol for anonymous communication in SDN network. Secur Commun Netw 2018
Xiong X, Ma L, Cui C (2019) Simulation environment of evaluation and optimization for moving target defense: a SimPy approach. In: Proceedings of the 2019 the 9th international conference on communication and network security, pp 114–117
Xu C, Zhang T, Kuang X, Zhou Z, Yu S (2021) Context-aware adaptive route mutation scheme: a reinforcement learning approach. IEEE Internet Things J 8(17):13528–13541
Xu X, Hu H, Liu Y, Tan J, Zhang H, Song H (2022) Moving target defense of routing randomization with deep reinforcement learning against eavesdropping attack. Digit Commun Netw
Yu M, Xie T, He T, McDaniel P, Burke QK (2021) Flow table security in SDN: adversarial reconnaissance and intelligent attacks. IEEE/ACM Trans Netw 29(6):2793–2806
Zhang H, Lei C, Chang D, Yang Y (2017) Network moving target defense technique based on collaborative mutation. Comput Secur 70:51–71
Zhang T, Kuang X, Zhou Z, Gao H, Xu C (2019) An intelligent route mutation mechanism against mixed attack based on security awareness. In: 2019 IEEE global communications conference (GLOBECOM). IEEE, pp 1–6
Zhao Z, Gong D, Lu B, Liu F, Zhang C (2016) SDN-based double hopping communication against sniffer attack. Math Probl Eng 2016
Zhou C, Quan W, Gao D, Liu Z, Yu C, Liu M, Xu Z (2021) AMS: adaptive multipath scheduling mechanism against eavesdropping attacks with programmable data planes. In: 2021 IEEE 5th advanced information technology, electronic and automation control conference (IAEAC). IEEE, vol 5, pp 2357–2361
Acknowledgements
This work was supported by the National Natural Science Foundation of China (61861013), Science and Technology Major Project of Guangxi (AA18118031), and Guangxi Natural Science Foundation (2018GXNSFAA281318). Besides, we would like to extend our sincere appreciation to the editor and reviewers for their time and effort dedicated to the handling of this manuscript.
Funding
National Natural Science Foundation of China (61861013), Science and Technology Major Project of Guangxi (AA18118031), and Guangxi Natural Science Foundation (2018GXNSFAA281318).
Author information
Contributions
YL: investigation, methodology, software, writing, editing, experiment, review. YW: discussion, review, supervision. HF: discussion, review.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
In this section, we prove the conclusion that \(Pr_{atk'} \ll Pr_{atk}\) and \(Pr_{atk'}^i \ll Pr_{atk}^i\) when the number of sessions N is within a practical range. Before this, we establish a latent relationship between the number of segments G and the number of sessions N. Since a complete session contains far more than three segments (including a start segment, an intermediate segment and an end segment), we have \(G \gg 3N\). Therefore, the inequality (A1) can be obtained. Additionally, according to Eq. (1), Eq. (2) and Eq. (6), the inequality (A2) can be obtained. On this basis, we construct the functions \(f_{atk'}\left( x\right) =\left( 1-\frac{1}{M}\right) ^{\frac{x^2}{2}}\) and \(f_{atk}\left( x\right) =\frac{1}{x^{x}}\), where \(2 \le x \le 11048\). Here, x indicates the number of sessions within a domain, and its upper bound is determined by the total number of sessions within the dataset univ_pt1. Denote \(Z\left( x\right)\) as the difference of the logarithms of \(f_{atk'}\left( x\right)\) and \(f_{atk}\left( x\right)\), as presented in (A3). Since \(Z\left( x\right)\) is continuous on its domain, it is not difficult to prove that \(Z\left( x\right)\) has a maximum value, and \(\max Z\left( x\right) < 0\). Therefore, \(Pr_{atk'} \ll Pr_{atk}\) is proved. Notably, the increment size of the sequence number M is a constant, such as the maximum TCP payload size of 1460. With the same method, \(Pr_{atk'}^i \ll Pr_{atk}^i\) can be obtained. To sum up, the conclusion is proved.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Liu, Y., Wang, Y. & Feng, H. A proactive defense method against eavesdropping attack in SDN-based storage environment. Cybersecurity 7, 58 (2024). https://doi.org/10.1186/s42400-024-00255-3
DOI: https://doi.org/10.1186/s42400-024-00255-3