|
1.INTRODUCTIONWith the rapid development of the Internet, there is a large number of network attacks and cybercriminals in the network. The attacks often hide their identity through several stepping stones, such as in The Onion Router (TOR) network. To track attacks, we need to analyze whether two network flows are correlated. There are two kinds of flow correlation technology have been proposed, named passive traffic analysis and active traffic analysis. The passive methods often need long observation time and large communication costs1. Therefore, the active methods such as network flow watermarking technologies are proposed, which inject watermark into network flow. The network flow watermarking changes traffic features such as sending rates2 and inter-packet delays3. The network flow watermarking has higher accuracy, lower false positive rate, and shorter observation time compared with active methods. Robustness is an important characteristic of flow watermarking technologies, which should be able to resist packet delays and timing interferences4. The detection performance is the most significant factor in improving robustness, which have been analyzed in 20093. To improve detection performance, many codes-based network flow watermarking has been proposed. In 2011, a long Pseudo-Noise (PN) code based Direct Sequence Spread Spectrum (DSSS)5, 6 flow network watermarking is proposed, which modulates each bit with a small segment of long PN codes. It is only resisted to previously discovered detection approaches. In 2011, a coded-based RAINROW7, 8 named C-RAINROW is proposed, which uses Repeat-Accumulate (RA) codes to encode watermark sequence. However, RA codes cannot make full use of unknown length network flow. LT codes9 proposed in 2002 have the capacity of approaching full use of network flow, which can generate a potentially infinite number of encoded symbols. In this paper, we investigate the use of LT codes to encode watermarking before injecting it into network flow, and propose the framework of LT codes-based network flow watermarking. The degree distribution is the key part of LT codes, which determines the performance of LT codes. It is obvious that the detection performance of network flow watermarking is dependent on block error rate (BER) of LT codes. Therefore, we analyze the relationship between the degree distribution and the BER and propose a method to search for the optimal degree distribution through a heuristic algorithm. Simulations show that our proposed LT codes-based network flow watermarking outperforms others in terms of detection performance. 2.TECHNICAL OVERVIEW2.1Interval centroid based flow watermarkingThe method10, 11 offsets σ from the position where the flow starts, and then sequentially extracts the flow periods of length Tt. The extracted flow period is divided into 2rl time slots on average according to the time interval t, recorded as and , and are the number of data packets in and respectively, and are the position of the k-th packet in time intervals and , respectively, that is, the relative time within the time interval. Then the slot centroid cent of and : Since obeys a uniform distribution with the symmetry axis 0, an offset a (0 < a < T) can be applied to or to complete the embedding of the watermark bits by changing the symmetry axis of Yi. 2.2LT codeGiven a degree distribution, an infinite number of coded symbols can be generated by the encoder. Each encoded symbol is generated as follows9.
BP decoding is used in this paper since the computation complexity of it is low. The decoding process of the BP decoding is as follows12:
The degree distribution is the key part of LT codes, which determines the performance of LT codes. A good degree distribution can effectively reduce the decoding overhead and improve the decoding success rate. According to different application requirements, scholars have proposed robust soliton distribution (RSD), binary exponential distribution (BED)13, Poisson distribution (PD), stationary distribution (SD), switch degree distribution14 and Modified Binary Robust Distribution (MBRD)15. In 201l, Yao et al. proposed a new LT code degree distribution constructed by organically combining position distribution and robust soliton distribution (PRSD)16, which is used in this paper. The RSD is shown in formula (3). where The parameter . We can see that the performance of RSD can be improved by searching optimal parameters c,σ. The PD is shown in formula (6). We also can see that the performance of PD can be improved by searching optimal parameter λ. 3.FRAMEWORK OF CODES-BASED NETWORK FLOW WATERMARKINGLet W = {a1, a2,…ak} denotes the watermarks, where k denotes the length of watermark. To solve the problem of missing part of the embedded watermarks, LT coding is used to encode the watermarks before embedding it into network flow, which will improve the performance of detection rate. The method proposed in this paper can adapt to the size characteristics of network flows, which is shown in Figure 1. Watermark embedding node
Watermark extraction node 4.OPTIMAL DEGREE DISTRIBUTIONThe degree distribution is important in improving the detection performance of network flow watermarking. In this paper, the joint degree distribution θ(d) is used, which combined RSD μ(d) and PDω(d). The degree distribution θ(d)is shown in formula (7). where α is the scale factor, which is a constant between 0 and 1. To further improve the performance of LT codes for network flow watermarking, an optimal degree distribution is generated through a heuristic algorithm. Given N encoded symbols, the degree distribution with optimal parameters < α,c,σ,λ ˃ can be obtained to achieve the lowest block error rate (BER). In this section, the ACO is used to optimize the degree distribution of LT codes based on network flow watermarking. The optimization goal of the ACO is to obtain the lowest BER, which is shown in formula (8). where Ber(k,N,α,c,σ, λ)returns the BER of decoding process. The function Ber can be calculated by simulations, which is shown in formula (9). where ei is 0 if the decoding process is success, otherwise is 1. To compare and utilize the search result of different ants, we use pheromone of each ant. Let p denotes the pheromone of ant, which is updated as follow. where b denotes the volatilization rate of pheromone. Let αmin, cmin, σmin, λmin and αmax, cmax, σmax, λmax denote the minimize and maximize value of parameters α,c,σ, λ respectively. The initial value of is shown in formula (11). where r is a random value. Let ∆α,∆c,∆σ,∆λ denote the update step length of parameters α,c,σ,λ respectively, which is shown in formula (12). where t denotes the iteration times and s is search factor. The search factor of each ant may be different. Let υ denotes the probability that an ant is the closest to the optimal solution, which can be calculated as follow. where pmax denotes the maximum p have been searched. The search factor s can be set to a large value if υ is small, otherwise s can be set to a small value. We invalid our propose method by simulations. Figure 2 shows the optimal degree distributions searched by our proposed algorithm. Optimal parameters can be achieved when iteration times more than 20 times. 5.EXPERIMENTAL RESULTS AND ANALYSISIn this section we evaluate the block error rate (BER) of proposed scheme compared with VT codes. The network flow in our simulations is generated from independent passion process of rate 200 packets per second. Let k denotes the length of watermarking in byte, that is 8k bit. The VT0 (6) is used in VT codes. In our simulations, N encoded symbols is generated by VT codes and LT codes by given k input symbols. Then 5N bits are generated by PN codes (1,−1,1,−1,1). Finally, each bit is inserted into network flow by ICB flow watermarking scheme with maximum delay a = 20ms. Network jitters is simulated as Normal distribution with parameters (μ,σ). We assume that mean μ is zero. Figure 3 illustrates the BER for different value of σ. We can see that BER increased as σ increased since the more inter-packet delays are impacted by network jitters. Our proposed schemes outperform VT codes when σ ≥ 4. This is because the loss bit of watermarking can be recovered by redundant encoded symbols in LT codes. We also can see that N=2.5*k outperforms N=2*k since our proposed scheme can take full advantage of network flow. Figure 4 illustrates the BER for different value of k with σ = 4. We can see that the BER of VT codes increased as k increased. This is because the probability of symbol loss increased as k increased. The BER of LT codes decreased as k increased since the performance of LT codes increased as the k increased. It is obvious that our proposed scheme outperforms VT codes when k is large. 6.CONCLUSIONThe flow watermarking scheme based on LT codes for channels with jitter has been developed. The degree distribution of LT codes is optimal which is searched by a heuristic algorithm. LT codes can take full advantage of network flow by inserting more bits of encoded symbols into network flow, which is verified by our simulations. Furthermore, simulations demonstrate that the proposed scheme outperforms VT codes in terms of BER. REFERENCESYe, Z., Fu, X., Graham, B., Bettati, R. and Wei, Z.,
“Correlation-based traffic analysis attacks on anonymity networks,”
IEEE Trans. Parallel Distrib. Syst, 21
(7), 954
–967
(2010). https://doi.org/10.1109/TPDS.2009.146 Google Scholar
Yu, W., Fu, X., Graham, S., Xuan, D. and Zhao, W.,
“DSSS-based flow marking technique for invisible traceback,”
in Proc IEEE Symp. on Security and Privacy,
7
–21
(2007). Google Scholar
Fu, C., Qian, W. Z., Zhao, M. Y. and Qin, Z. G.,
“Delay normalization method of defending against timing-based attacks on anonymous communication systems,”
J. Southeast Univ., Nat. Sci. Ed, 39
(4), 738
–741
(2009). Google Scholar
Iacovazzi, A. and Elovici, Y.,
“Network flow watermarking: A survey,”
IEEE Commun. Surv. Tutorials, 512
–530
(2017). https://doi.org/10.1109/COMST.2016.2604405 Google Scholar
Huang, J., Xian, P., Fu, X. and Jie, W.,
“Long PN code based DSSS watermarking,”
in Proceedings IEEE INFOCOM,
2426
–2434
(2011). Google Scholar
Yu, W., Fu, X., Graham, S., Xuan, D. and Zhao, W.,
“DSSS-based flow marking technique for invisible traceback,”
in IEEE Symposium on Security and Privacy (SP’07),
18
–32
(2007). Google Scholar
Houmansadr, A. and Borisov, N.,
“Towards improving network flow watermarks using the repeat-accumulate codes,”
in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
1852
–1855
(2011). Google Scholar
Houmansadr, A., Kiyavash, N. and Borisov, N.,
“RAINBOW: A robust and invisible non-blind watermark for network flows,”
in Proceedings of the Network and Distributed System Security Symposium,
406
–422
(2009). Google Scholar
Luby, M.,
“LT Codes,”
in 43rd Symposium on Foundations of Computer Science (FOCS 2002),
271
–280
(2002). Google Scholar
Hua, J. and Cheng, W.,
“Progress in research on active network flow watermark,”
Application Research of Computers, 37
(7), 1924
–1930
(2020). Google Scholar
Wang, X. and Reeves, D. S.,
“Robust correlation of encrypted attack traffic through stepping stones by manipulation of interpacket delays,”
in Proceedings of the 10th ACM Conference on Computer and Communications Security,
20
–29
(2003). Google Scholar
Zhang, Y.,
“The LT code decoding algorithm,”
HENAN SCIENCE, 1643
–1646
(2013). Google Scholar
Agha, K. A., Kadi, N. and Stojmenovic, I.,
“Fountain codes with XOR of encoded packets for broadcasting and source independent backbone in multi-hop networks using network coding,”
in IEEE Vehicular Technology Conference,
1
–5
(2009). Google Scholar
Lei, W. J., Liu, H. F. and Xie, X. Z.,
“Switch degree distribution: an improved degree distribution for LT digital fountain code,”
Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 34
–38
(2012). Google Scholar
Lei, W. J., Zhang, M., Xie, X. Z.,
“A design scheme for LT codes degree distribution by combining degree distributions and optimizing ripple size,”
ACTA ELECTRONICA SINICA, 43
(4), 800
–805
(2015). Google Scholar
Yao, W., Yi, B., Huang, T. and Li, W.,
“Poisson robust soliton distribution (PRSD) for LT codes,”
IEEE Communications Letters, 20
(8), 1499
–1502
(2016). https://doi.org/10.1109/LCOMM.2016.2578920 Google Scholar
|