Nothing Special   »   [go: up one dir, main page]

Open Access Paper
28 December 2022 An optimal LT code for network flow watermarking
Sheng Gao, Lei Zhang
Author Affiliations +
Proceedings Volume 12506, Third International Conference on Computer Science and Communication Technology (ICCSCT 2022); 1250614 (2022) https://doi.org/10.1117/12.2662345
Event: International Conference on Computer Science and Communication Technology (ICCSCT 2022), 2022, Beijing, China
Abstract
Network flow watermarking has been employed to analyze whether two flows are correlated, which is often applied in non-cooperative networks. The detection performance of network flow watermarking is poor due to delay jitter, packets loss etc. In this paper, we improve the detection performance of network flow watermarking by using Luby Transform (LT) codes, which can take full advantage of network flow by automatically adapting to the length of network flow. To further improve the performance of LT codes for network flow watermarking, an optimal degree distribution is generated through a heuristic algorithm. Theoretical analysis and simulations show that our proposed method outperforms others in terms of the detection performance.

1.

INTRODUCTION

With the rapid development of the Internet, there is a large number of network attacks and cybercriminals in the network. The attacks often hide their identity through several stepping stones, such as in The Onion Router (TOR) network. To track attacks, we need to analyze whether two network flows are correlated. There are two kinds of flow correlation technology have been proposed, named passive traffic analysis and active traffic analysis. The passive methods often need long observation time and large communication costs1. Therefore, the active methods such as network flow watermarking technologies are proposed, which inject watermark into network flow. The network flow watermarking changes traffic features such as sending rates2 and inter-packet delays3. The network flow watermarking has higher accuracy, lower false positive rate, and shorter observation time compared with active methods.

Robustness is an important characteristic of flow watermarking technologies, which should be able to resist packet delays and timing interferences4. The detection performance is the most significant factor in improving robustness, which have been analyzed in 20093. To improve detection performance, many codes-based network flow watermarking has been proposed. In 2011, a long Pseudo-Noise (PN) code based Direct Sequence Spread Spectrum (DSSS)5, 6 flow network watermarking is proposed, which modulates each bit with a small segment of long PN codes. It is only resisted to previously discovered detection approaches. In 2011, a coded-based RAINROW7, 8 named C-RAINROW is proposed, which uses Repeat-Accumulate (RA) codes to encode watermark sequence. However, RA codes cannot make full use of unknown length network flow.

LT codes9 proposed in 2002 have the capacity of approaching full use of network flow, which can generate a potentially infinite number of encoded symbols. In this paper, we investigate the use of LT codes to encode watermarking before injecting it into network flow, and propose the framework of LT codes-based network flow watermarking. The degree distribution is the key part of LT codes, which determines the performance of LT codes. It is obvious that the detection performance of network flow watermarking is dependent on block error rate (BER) of LT codes. Therefore, we analyze the relationship between the degree distribution and the BER and propose a method to search for the optimal degree distribution through a heuristic algorithm. Simulations show that our proposed LT codes-based network flow watermarking outperforms others in terms of detection performance.

2.

TECHNICAL OVERVIEW

2.1

Interval centroid based flow watermarking

The method10, 11 offsets σ from the position where the flow starts, and then sequentially extracts the flow periods of length Tt. The extracted flow period is divided into 2rl time slots on average according to the time interval t, recorded as 00064_PSISDG12506_1250614_page_2_1.jpg and 00064_PSISDG12506_1250614_page_2_2.jpg, 00064_PSISDG12506_1250614_page_2_3.jpg and 00064_PSISDG12506_1250614_page_2_4.jpg are the number of data packets in 00064_PSISDG12506_1250614_page_2_5.jpg and 00064_PSISDG12506_1250614_page_2_6.jpg respectively, 00064_PSISDG12506_1250614_page_2_7.jpg and 00064_PSISDG12506_1250614_page_2_8.jpg are the position of the k-th packet in time intervals 00064_PSISDG12506_1250614_page_2_9.jpg and 00064_PSISDG12506_1250614_page_2_10.jpg, respectively, that is, the relative time within the time interval. Then the slot centroid cent of 00064_PSISDG12506_1250614_page_2_11.jpg and 00064_PSISDG12506_1250614_page_2_12.jpg:

00064_PSISDG12506_1250614_page_2_13.jpg
00064_PSISDG12506_1250614_page_2_14.jpg

Since 00064_PSISDG12506_1250614_page_2_15.jpg obeys a uniform distribution with the symmetry axis 0, an offset a (0 < a < T) can be applied to 00064_PSISDG12506_1250614_page_2_16.jpg or 00064_PSISDG12506_1250614_page_2_17.jpg to complete the embedding of the watermark bits by changing the symmetry axis of Yi.

2.2

LT code

Given a degree distribution, an infinite number of coded symbols can be generated by the encoder. Each encoded symbol is generated as follows9.

  • (1) Select a degree d according to the degree distribution.

  • (2) Randomly select d input symbols.

  • (3) XOR these d input symbols and the result is the generated encoded symbol.

BP decoding is used in this paper since the computation complexity of it is low. The decoding process of the BP decoding is as follows12:

  • (1) When the receiver receives a certain number of encoded symbols, a bidirectional graph is established according to the known correspondence between the encoded symbols and the input symbols;

  • (2) If there is an encoded symbol with a degree 1, restore the corresponding input symbol and remove the encoded symbol from the bidirectional graph. If it does not exist, the decoding stops;

  • (3) Remove the recovered input symbols from other encoded symbols participating in XOR, and reduce the degree of the encoded symbols by one;

  • (4) Repeat steps 2 and 3, if all the input symbols are recovered, the decoding is successful. Otherwise, the decoding fails.

The degree distribution is the key part of LT codes, which determines the performance of LT codes. A good degree distribution can effectively reduce the decoding overhead and improve the decoding success rate. According to different application requirements, scholars have proposed robust soliton distribution (RSD), binary exponential distribution (BED)13, Poisson distribution (PD), stationary distribution (SD), switch degree distribution14 and Modified Binary Robust Distribution (MBRD)15. In 201l, Yao et al. proposed a new LT code degree distribution constructed by organically combining position distribution and robust soliton distribution (PRSD)16, which is used in this paper. The RSD is shown in formula (3).

00064_PSISDG12506_1250614_page_2_18.jpg

where

00064_PSISDG12506_1250614_page_3_1.jpg
00064_PSISDG12506_1250614_page_3_2.jpg

The parameter 00064_PSISDG12506_1250614_page_3_3.jpg. We can see that the performance of RSD can be improved by searching optimal parameters c,σ. The PD is shown in formula (6).

00064_PSISDG12506_1250614_page_3_4.jpg

We also can see that the performance of PD can be improved by searching optimal parameter λ.

3.

FRAMEWORK OF CODES-BASED NETWORK FLOW WATERMARKING

Let W = {a1, a2,…ak} denotes the watermarks, where k denotes the length of watermark. To solve the problem of missing part of the embedded watermarks, LT coding is used to encode the watermarks before embedding it into network flow, which will improve the performance of detection rate. The method proposed in this paper can adapt to the size characteristics of network flows, which is shown in Figure 1.

Figure 1.

Network frame diagram.

00064_PSISDG12506_1250614_page_3_5.jpg

Watermark embedding node

  • (1) Generate an encoded symbol si with length l from W by using LT codes.

  • (2) Generate a symbol ui with length l×m from encoded symbol si by a PN codes P with length m.

  • (3) Each bit in symbol ui is embedded into network flow by using ICBW algorithm.

Watermark extraction node

  • (1) Exact l×m bit from network flow, and form a symbol 00064_PSISDG12506_1250614_page_4_1.jpg.

  • (2) The encoded symbol of LT codes si is generated from 00064_PSISDG12506_1250614_page_4_2.jpg by the PN codes P.

  • (3) When enough encoded symbols si are received, W can be recovered by LT codes.

4.

OPTIMAL DEGREE DISTRIBUTION

The degree distribution is important in improving the detection performance of network flow watermarking. In this paper, the joint degree distribution θ(d) is used, which combined RSD μ(d) and PDω(d). The degree distribution θ(d)is shown in formula (7).

00064_PSISDG12506_1250614_page_4_3.jpg

where α is the scale factor, which is a constant between 0 and 1. To further improve the performance of LT codes for network flow watermarking, an optimal degree distribution is generated through a heuristic algorithm. Given N encoded symbols, the degree distribution with optimal parameters < α,c,σ,λ ˃ can be obtained to achieve the lowest block error rate (BER). In this section, the ACO is used to optimize the degree distribution of LT codes based on network flow watermarking. The optimization goal of the ACO is to obtain the lowest BER, which is shown in formula (8).

00064_PSISDG12506_1250614_page_4_4.jpg

where Ber(k,N,α,c,σ, λ)returns the BER of decoding process. The function Ber can be calculated by simulations, which is shown in formula (9).

00064_PSISDG12506_1250614_page_4_5.jpg

where ei is 0 if the decoding process is success, otherwise is 1.

To compare and utilize the search result of different ants, we use pheromone of each ant. Let p denotes the pheromone of ant, which is updated as follow.

00064_PSISDG12506_1250614_page_4_6.jpg

where b denotes the volatilization rate of pheromone.

Let αmin, cmin, σmin, λmin and αmax, cmax, σmax, λmax denote the minimize and maximize value of parameters α,c,σ, λ respectively. The initial value of is shown in formula (11).

00064_PSISDG12506_1250614_page_4_7.jpg

where r is a random value.

Let ∆α,∆c,∆σ,∆λ denote the update step length of parameters α,c,σ,λ respectively, which is shown in formula (12).

00064_PSISDG12506_1250614_page_5_1.jpg

where t denotes the iteration times and s is search factor. The search factor of each ant may be different. Let υ denotes the probability that an ant is the closest to the optimal solution, which can be calculated as follow.

00064_PSISDG12506_1250614_page_5_2.jpg

where pmax denotes the maximum p have been searched. The search factor s can be set to a large value if υ is small, otherwise s can be set to a small value.

We invalid our propose method by simulations. Figure 2 shows the optimal degree distributions searched by our proposed algorithm. Optimal parameters can be achieved when iteration times more than 20 times.

Figure 2.

BER as a function of iterations T.

00064_PSISDG12506_1250614_page_5_3.jpg

5.

EXPERIMENTAL RESULTS AND ANALYSIS

In this section we evaluate the block error rate (BER) of proposed scheme compared with VT codes. The network flow in our simulations is generated from independent passion process of rate 200 packets per second. Let k denotes the length of watermarking in byte, that is 8k bit. The VT0 (6) is used in VT codes. In our simulations, N encoded symbols is generated by VT codes and LT codes by given k input symbols. Then 5N bits are generated by PN codes (1,−1,1,−1,1). Finally, each bit is inserted into network flow by ICB flow watermarking scheme with maximum delay a = 20ms. Network jitters is simulated as Normal distribution with parameters (μ,σ). We assume that mean μ is zero.

Figure 3 illustrates the BER for different value of σ. We can see that BER increased as σ increased since the more inter-packet delays are impacted by network jitters. Our proposed schemes outperform VT codes when σ ≥ 4. This is because the loss bit of watermarking can be recovered by redundant encoded symbols in LT codes. We also can see that N=2.5*k outperforms N=2*k since our proposed scheme can take full advantage of network flow.

Figure 3.

BER as a function of σ.

00064_PSISDG12506_1250614_page_6_1.jpg

Figure 4 illustrates the BER for different value of k with σ = 4. We can see that the BER of VT codes increased as k increased. This is because the probability of symbol loss increased as k increased. The BER of LT codes decreased as k increased since the performance of LT codes increased as the k increased. It is obvious that our proposed scheme outperforms VT codes when k is large.

Figure 4.

BER as a function of k.

00064_PSISDG12506_1250614_page_6_2.jpg

6.

CONCLUSION

The flow watermarking scheme based on LT codes for channels with jitter has been developed. The degree distribution of LT codes is optimal which is searched by a heuristic algorithm. LT codes can take full advantage of network flow by inserting more bits of encoded symbols into network flow, which is verified by our simulations. Furthermore, simulations demonstrate that the proposed scheme outperforms VT codes in terms of BER.

REFERENCES

[1] 

Ye, Z., Fu, X., Graham, B., Bettati, R. and Wei, Z., “Correlation-based traffic analysis attacks on anonymity networks,” IEEE Trans. Parallel Distrib. Syst, 21 (7), 954 –967 (2010). https://doi.org/10.1109/TPDS.2009.146 Google Scholar

[2] 

Yu, W., Fu, X., Graham, S., Xuan, D. and Zhao, W., “DSSS-based flow marking technique for invisible traceback,” in Proc IEEE Symp. on Security and Privacy, 7 –21 (2007). Google Scholar

[3] 

Fu, C., Qian, W. Z., Zhao, M. Y. and Qin, Z. G., “Delay normalization method of defending against timing-based attacks on anonymous communication systems,” J. Southeast Univ., Nat. Sci. Ed, 39 (4), 738 –741 (2009). Google Scholar

[4] 

Iacovazzi, A. and Elovici, Y., “Network flow watermarking: A survey,” IEEE Commun. Surv. Tutorials, 512 –530 (2017). https://doi.org/10.1109/COMST.2016.2604405 Google Scholar

[5] 

Huang, J., Xian, P., Fu, X. and Jie, W., “Long PN code based DSSS watermarking,” in Proceedings IEEE INFOCOM, 2426 –2434 (2011). Google Scholar

[6] 

Yu, W., Fu, X., Graham, S., Xuan, D. and Zhao, W., “DSSS-based flow marking technique for invisible traceback,” in IEEE Symposium on Security and Privacy (SP’07), 18 –32 (2007). Google Scholar

[7] 

Houmansadr, A. and Borisov, N., “Towards improving network flow watermarks using the repeat-accumulate codes,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 1852 –1855 (2011). Google Scholar

[8] 

Houmansadr, A., Kiyavash, N. and Borisov, N., “RAINBOW: A robust and invisible non-blind watermark for network flows,” in Proceedings of the Network and Distributed System Security Symposium, 406 –422 (2009). Google Scholar

[9] 

Luby, M., “LT Codes,” in 43rd Symposium on Foundations of Computer Science (FOCS 2002), 271 –280 (2002). Google Scholar

[10] 

Hua, J. and Cheng, W., “Progress in research on active network flow watermark,” Application Research of Computers, 37 (7), 1924 –1930 (2020). Google Scholar

[11] 

Wang, X. and Reeves, D. S., “Robust correlation of encrypted attack traffic through stepping stones by manipulation of interpacket delays,” in Proceedings of the 10th ACM Conference on Computer and Communications Security, 20 –29 (2003). Google Scholar

[12] 

Zhang, Y., “The LT code decoding algorithm,” HENAN SCIENCE, 1643 –1646 (2013). Google Scholar

[13] 

Agha, K. A., Kadi, N. and Stojmenovic, I., “Fountain codes with XOR of encoded packets for broadcasting and source independent backbone in multi-hop networks using network coding,” in IEEE Vehicular Technology Conference, 1 –5 (2009). Google Scholar

[14] 

Lei, W. J., Liu, H. F. and Xie, X. Z., “Switch degree distribution: an improved degree distribution for LT digital fountain code,” Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 34 –38 (2012). Google Scholar

[15] 

Lei, W. J., Zhang, M., Xie, X. Z., “A design scheme for LT codes degree distribution by combining degree distributions and optimizing ripple size,” ACTA ELECTRONICA SINICA, 43 (4), 800 –805 (2015). Google Scholar

[16] 

Yao, W., Yi, B., Huang, T. and Li, W., “Poisson robust soliton distribution (PRSD) for LT codes,” IEEE Communications Letters, 20 (8), 1499 –1502 (2016). https://doi.org/10.1109/LCOMM.2016.2578920 Google Scholar
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Sheng Gao and Lei Zhang "An optimal LT code for network flow watermarking", Proc. SPIE 12506, Third International Conference on Computer Science and Communication Technology (ICCSCT 2022), 1250614 (28 December 2022); https://doi.org/10.1117/12.2662345
Advertisement
Advertisement
RIGHTS & PERMISSIONS
Get copyright permission  Get copyright permission on Copyright Marketplace
KEYWORDS
Digital watermarking

RELATED CONTENT


Back to Top