An optimal LT code for network flow watermarking

28 December 2022 An optimal LT code for network flow watermarking

Sheng Gao, Lei Zhang

Proceedings Volume 12506, Third International Conference on Computer Science and Communication Technology (ICCSCT 2022); 1250614 (2022) https://doi.org/10.1117/12.2662345
Event: International Conference on Computer Science and Communication Technology (ICCSCT 2022), 2022, Beijing, China

Abstract

Network flow watermarking has been employed to analyze whether two flows are correlated, which is often applied in non-cooperative networks. The detection performance of network flow watermarking is poor due to delay jitter, packets loss etc. In this paper, we improve the detection performance of network flow watermarking by using Luby Transform (LT) codes, which can take full advantage of network flow by automatically adapting to the length of network flow. To further improve the performance of LT codes for network flow watermarking, an optimal degree distribution is generated through a heuristic algorithm. Theoretical analysis and simulations show that our proposed method outperforms others in terms of the detection performance.

1. INTRODUCTION

With the rapid development of the Internet, there is a large number of network attacks and cybercriminals in the network. The attacks often hide their identity through several stepping stones, such as in The Onion Router (TOR) network. To track attacks, we need to analyze whether two network flows are correlated. There are two kinds of flow correlation technology have been proposed, named passive traffic analysis and active traffic analysis. The passive methods often need long observation time and large communication costs¹. Therefore, the active methods such as network flow watermarking technologies are proposed, which inject watermark into network flow. The network flow watermarking changes traffic features such as sending rates² and inter-packet delays³. The network flow watermarking has higher accuracy, lower false positive rate, and shorter observation time compared with active methods.

Robustness is an important characteristic of flow watermarking technologies, which should be able to resist packet delays and timing interferences⁴. The detection performance is the most significant factor in improving robustness, which have been analyzed in 2009³. To improve detection performance, many codes-based network flow watermarking has been proposed. In 2011, a long Pseudo-Noise (PN) code based Direct Sequence Spread Spectrum (DSSS)⁵^, ⁶ flow network watermarking is proposed, which modulates each bit with a small segment of long PN codes. It is only resisted to previously discovered detection approaches. In 2011, a coded-based RAINROW⁷^, ⁸ named C-RAINROW is proposed, which uses Repeat-Accumulate (RA) codes to encode watermark sequence. However, RA codes cannot make full use of unknown length network flow.

LT codes⁹ proposed in 2002 have the capacity of approaching full use of network flow, which can generate a potentially infinite number of encoded symbols. In this paper, we investigate the use of LT codes to encode watermarking before injecting it into network flow, and propose the framework of LT codes-based network flow watermarking. The degree distribution is the key part of LT codes, which determines the performance of LT codes. It is obvious that the detection performance of network flow watermarking is dependent on block error rate (BER) of LT codes. Therefore, we analyze the relationship between the degree distribution and the BER and propose a method to search for the optimal degree distribution through a heuristic algorithm. Simulations show that our proposed LT codes-based network flow watermarking outperforms others in terms of detection performance.

2. TECHNICAL OVERVIEW

2.1

Interval centroid based flow watermarking

The method¹⁰^, ¹¹ offsets σ from the position where the flow starts, and then sequentially extracts the flow periods of length Tt. The extracted flow period is divided into 2rl time slots on average according to the time interval t, recorded as and , and are the number of data packets in and respectively, and are the position of the k-th packet in time intervals and , respectively, that is, the relative time within the time interval. Then the slot centroid cent of and :

Since obeys a uniform distribution with the symmetry axis 0, an offset a (0 < a < T) can be applied to or to complete the embedding of the watermark bits by changing the symmetry axis of Y_i.

2.2

LT code

Given a degree distribution, an infinite number of coded symbols can be generated by the encoder. Each encoded symbol is generated as follows⁹.

(1) Select a degree d according to the degree distribution.
(2) Randomly select d input symbols.
(3) XOR these d input symbols and the result is the generated encoded symbol.

BP decoding is used in this paper since the computation complexity of it is low. The decoding process of the BP decoding is as follows¹²:

(1) When the receiver receives a certain number of encoded symbols, a bidirectional graph is established according to the known correspondence between the encoded symbols and the input symbols;
(2) If there is an encoded symbol with a degree 1, restore the corresponding input symbol and remove the encoded symbol from the bidirectional graph. If it does not exist, the decoding stops;
(3) Remove the recovered input symbols from other encoded symbols participating in XOR, and reduce the degree of the encoded symbols by one;
(4) Repeat steps 2 and 3, if all the input symbols are recovered, the decoding is successful. Otherwise, the decoding fails.

The degree distribution is the key part of LT codes, which determines the performance of LT codes. A good degree distribution can effectively reduce the decoding overhead and improve the decoding success rate. According to different application requirements, scholars have proposed robust soliton distribution (RSD), binary exponential distribution (BED)¹³, Poisson distribution (PD), stationary distribution (SD), switch degree distribution¹⁴ and Modified Binary Robust Distribution (MBRD)¹⁵. In 201l, Yao et al. proposed a new LT code degree distribution constructed by organically combining position distribution and robust soliton distribution (PRSD)¹⁶, which is used in this paper. The RSD is shown in formula (3).

where

The parameter . We can see that the performance of RSD can be improved by searching optimal parameters c,σ. The PD is shown in formula (6).

We also can see that the performance of PD can be improved by searching optimal parameter λ.

3. FRAMEWORK OF CODES-BASED NETWORK FLOW WATERMARKING

Let W = {a₁, a₂,…a_k} denotes the watermarks, where k denotes the length of watermark. To solve the problem of missing part of the embedded watermarks, LT coding is used to encode the watermarks before embedding it into network flow, which will improve the performance of detection rate. The method proposed in this paper can adapt to the size characteristics of network flows, which is shown in Figure 1.

Figure 1.

Network frame diagram.

Watermark embedding node

(1) Generate an encoded symbol s_i with length l from W by using LT codes.
(2) Generate a symbol u_i with length l×m from encoded symbol s_i by a PN codes P with length m.
(3) Each bit in symbol u_i is embedded into network flow by using ICBW algorithm.

Watermark extraction node

(1) Exact l×m bit from network flow, and form a symbol .
(2) The encoded symbol of LT codes s_i is generated from by the PN codes P.
(3) When enough encoded symbols s_i are received, W can be recovered by LT codes.

4. OPTIMAL DEGREE DISTRIBUTION

The degree distribution is important in improving the detection performance of network flow watermarking. In this paper, the joint degree distribution θ(d) is used, which combined RSD μ(d) and PDω(d). The degree distribution θ(d)is shown in formula (7).

where α is the scale factor, which is a constant between 0 and 1. To further improve the performance of LT codes for network flow watermarking, an optimal degree distribution is generated through a heuristic algorithm. Given N encoded symbols, the degree distribution with optimal parameters < α,c,σ,λ ˃ can be obtained to achieve the lowest block error rate (BER). In this section, the ACO is used to optimize the degree distribution of LT codes based on network flow watermarking. The optimization goal of the ACO is to obtain the lowest BER, which is shown in formula (8).

where Ber(k,N,α,c,σ, λ)returns the BER of decoding process. The function Ber can be calculated by simulations, which is shown in formula (9).

where e_i is 0 if the decoding process is success, otherwise is 1.

To compare and utilize the search result of different ants, we use pheromone of each ant. Let p denotes the pheromone of ant, which is updated as follow.

where b denotes the volatilization rate of pheromone.

Let α_min, c_min, σ_min, λ_min and α_max, c_max, σ_max, λ_max denote the minimize and maximize value of parameters α,c,σ, λ respectively. The initial value of is shown in formula (11).

where r is a random value.

Let ∆α,∆c,∆σ,∆λ denote the update step length of parameters α,c,σ,λ respectively, which is shown in formula (12).

where t denotes the iteration times and s is search factor. The search factor of each ant may be different. Let υ denotes the probability that an ant is the closest to the optimal solution, which can be calculated as follow.

where p^max denotes the maximum p have been searched. The search factor s can be set to a large value if υ is small, otherwise s can be set to a small value.

We invalid our propose method by simulations. Figure 2 shows the optimal degree distributions searched by our proposed algorithm. Optimal parameters can be achieved when iteration times more than 20 times.

Figure 2.

BER as a function of iterations T.

5. EXPERIMENTAL RESULTS AND ANALYSIS

In this section we evaluate the block error rate (BER) of proposed scheme compared with VT codes. The network flow in our simulations is generated from independent passion process of rate 200 packets per second. Let k denotes the length of watermarking in byte, that is 8k bit. The VT₀ (6) is used in VT codes. In our simulations, N encoded symbols is generated by VT codes and LT codes by given k input symbols. Then 5N bits are generated by PN codes (1,−1,1,−1,1). Finally, each bit is inserted into network flow by ICB flow watermarking scheme with maximum delay a = 20ms. Network jitters is simulated as Normal distribution with parameters (μ,σ). We assume that mean μ is zero.

Figure 3 illustrates the BER for different value of σ. We can see that BER increased as σ increased since the more inter-packet delays are impacted by network jitters. Our proposed schemes outperform VT codes when σ ≥ 4. This is because the loss bit of watermarking can be recovered by redundant encoded symbols in LT codes. We also can see that N=2.5*k outperforms N=2*k since our proposed scheme can take full advantage of network flow.

Figure 3.

BER as a function of σ.

Figure 4 illustrates the BER for different value of k with σ = 4. We can see that the BER of VT codes increased as k increased. This is because the probability of symbol loss increased as k increased. The BER of LT codes decreased as k increased since the performance of LT codes increased as the k increased. It is obvious that our proposed scheme outperforms VT codes when k is large.

Figure 4.

BER as a function of k.

6. CONCLUSION

The flow watermarking scheme based on LT codes for channels with jitter has been developed. The degree distribution of LT codes is optimal which is searched by a heuristic algorithm. LT codes can take full advantage of network flow by inserting more bits of encoded symbols into network flow, which is verified by our simulations. Furthermore, simulations demonstrate that the proposed scheme outperforms VT codes in terms of BER.

REFERENCES

[1]

Ye, Z., Fu, X., Graham, B., Bettati, R. and Wei, Z., “Correlation-based traffic analysis attacks on anonymity networks,” IEEE Trans. Parallel Distrib. Syst, 21 (7), 954 –967 (2010). https://doi.org/10.1109/TPDS.2009.146 Google Scholar

[2]

Yu, W., Fu, X., Graham, S., Xuan, D. and Zhao, W., “DSSS-based flow marking technique for invisible traceback,” in Proc IEEE Symp. on Security and Privacy, 7 –21 (2007). Google Scholar

[3]

Fu, C., Qian, W. Z., Zhao, M. Y. and Qin, Z. G., “Delay normalization method of defending against timing-based attacks on anonymous communication systems,” J. Southeast Univ., Nat. Sci. Ed, 39 (4), 738 –741 (2009). Google Scholar

[4]

Iacovazzi, A. and Elovici, Y., “Network flow watermarking: A survey,” IEEE Commun. Surv. Tutorials, 512 –530 (2017). https://doi.org/10.1109/COMST.2016.2604405 Google Scholar

[5]

Huang, J., Xian, P., Fu, X. and Jie, W., “Long PN code based DSSS watermarking,” in Proceedings IEEE INFOCOM, 2426 –2434 (2011). Google Scholar

[6]

Yu, W., Fu, X., Graham, S., Xuan, D. and Zhao, W., “DSSS-based flow marking technique for invisible traceback,” in IEEE Symposium on Security and Privacy (SP’07), 18 –32 (2007). Google Scholar

[7]

Houmansadr, A. and Borisov, N., “Towards improving network flow watermarks using the repeat-accumulate codes,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 1852 –1855 (2011). Google Scholar

[8]

Houmansadr, A., Kiyavash, N. and Borisov, N., “RAINBOW: A robust and invisible non-blind watermark for network flows,” in Proceedings of the Network and Distributed System Security Symposium, 406 –422 (2009). Google Scholar

[9]

Luby, M., “LT Codes,” in 43rd Symposium on Foundations of Computer Science (FOCS 2002), 271 –280 (2002). Google Scholar

[10]

Hua, J. and Cheng, W., “Progress in research on active network flow watermark,” Application Research of Computers, 37 (7), 1924 –1930 (2020). Google Scholar

[11]

Wang, X. and Reeves, D. S., “Robust correlation of encrypted attack traffic through stepping stones by manipulation of interpacket delays,” in Proceedings of the 10th ACM Conference on Computer and Communications Security, 20 –29 (2003). Google Scholar

[12]

Zhang, Y., “The LT code decoding algorithm,” HENAN SCIENCE, 1643 –1646 (2013). Google Scholar

[13]

Agha, K. A., Kadi, N. and Stojmenovic, I., “Fountain codes with XOR of encoded packets for broadcasting and source independent backbone in multi-hop networks using network coding,” in IEEE Vehicular Technology Conference, 1 –5 (2009). Google Scholar

[14]

Lei, W. J., Liu, H. F. and Xie, X. Z., “Switch degree distribution: an improved degree distribution for LT digital fountain code,” Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 34 –38 (2012). Google Scholar

[15]

Lei, W. J., Zhang, M., Xie, X. Z., “A design scheme for LT codes degree distribution by combining degree distributions and optimizing ripple size,” ACTA ELECTRONICA SINICA, 43 (4), 800 –805 (2015). Google Scholar

[16]

Yao, W., Yi, B., Huang, T. and Li, W., “Poisson robust soliton distribution (PRSD) for LT codes,” IEEE Communications Letters, 20 (8), 1499 –1502 (2016). https://doi.org/10.1109/LCOMM.2016.2578920 Google Scholar

Citation Download Citation

Sheng Gao and Lei Zhang "An optimal LT code for network flow watermarking", Proc. SPIE 12506, Third International Conference on Computer Science and Communication Technology (ICCSCT 2022), 1250614 (28 December 2022); https://doi.org/10.1117/12.2662345

Access the abstract

PROCEEDINGS
6 PAGES

DOWNLOAD PAPER SAVE TO MY LIBRARY

GET CITATION

RIGHTS & PERMISSIONS

Get copyright permission Get copyright permission on Copyright Marketplace

KEYWORDS

Digital watermarking

1.

INTRODUCTION

2.

TECHNICAL OVERVIEW

2.1

Interval centroid based flow watermarking

2.2

LT code

3.

FRAMEWORK OF CODES-BASED NETWORK FLOW WATERMARKING

Figure 1.

4.

OPTIMAL DEGREE DISTRIBUTION

Figure 2.

5.

EXPERIMENTAL RESULTS AND ANALYSIS

Figure 3.

Figure 4.

6.

CONCLUSION

REFERENCES

Keywords/Phrases

Search In:

Publication Years