Abstract
Detecting the flows with abnormally large spreads over big network data can help us identify network attacks, such as DDoS attacks and scanners. Most per-flow measurement studies use compact data structures to reduce their memory requirements, fitting in the limited on-chip memory and catching up with the line rate. In this paper, we study a novel problem called spread estimation among multi-periods to measure the total number of distinct elements or the number of distinct k-persistent elements in a flow among multiple traffic measurement periods. In our design, we use an on-chip/off-chip model to record the per-flow traffic information, which uses small on-chip memory and matches the line rate, i.e., we use on-chip memory to filter out the duplicates, sample the elements, and store the sampled traffic data in off-chip memory. By performing the set operations on the sampled traffic data, we can derive the total number of distinct elements and the number of distinct k-persistent elements among multiple periods based on probability analysis. The experimental results on real Internet traffic traces show that, when performing spread estimation among multiple periods, our estimator is efficient in memory usage and estimation accuracy and can efficiently detect the stealthy DDoS attack and scanners.
Similar content being viewed by others
References
Estan C, Varghese G (2003) New Directions in Traffic Measurement and Accounting: Focusing on the Elephants, Ignoring the Mice. ACM Trans Comput Syst 21(3):270–313
Heule S, Nunkesser M, Hall A (2013) HyperLogLog in Practice: Algorithmic Engineering of a State of the Art Cardinality Estimation Algorithm. In: Proceedings of EDBT, pp 683–692
Lieven P, Scheuermann B (2010) High-Speed Per-Flow Traffic Measurement with Probabilistic Multiplicity Counting. In: Proceedings of IEEE INFOCOM, pp 1–9
Yoon M, Li T, Chen S, Peir J (2009) Fit a Spread Estimator in Small Memory. In: Proceedings of IEEE INFOCOM, pp 504–512
Yoon M, Kim Y J (2019) Address Block Counting Using Two-Tier Cardinality Estimation. IEEE Access 7:125754–125761
Jeong J, Naqvi S M A, Yoon M (2018) Accurate and Communication-Efficient Detection of Widespread Events. IEEE Access 6:61728–61734
Lu Y, Montanari A, Prabhakar B, Dharmapurikar S, Kabbani A (2008) Counter Braids: A Novel Counter Architecture for per-Flow Measurement. ACM SIGMETRICS Perform Eval Rev 36(1):121–132
Zhou Y, Zhou Y, Chen M, Xiao Q, Chen S (2016) Highly Compact Virtual Counters for Per-Flow Traffic Measurement through Register Sharing. In: Proceedings of IEEE GLOBECOM, pp 1–6
Zhou Y, Zhou Y, Chen S, Youlin Zhang (2017) Per-flow counting for big network data stream over sliding windows. In: Proceedings of IEEE/ACM IWQoS, pp 1–10
Zhou Y, Zhou Y, Chen S, Zhang Y (2018) Highly Compact Virtual Active Counters for Per-flow Traffic Measurement. In: Proceedings of IEEE INFOCOM, pp 1–9
Wang S, Wang S, Zhou D, Yang Y, Zhang W, Huang T, Huo R, Liu Y (2020) Large-scale and rapid flow size estimation for improving flow scheduling. In: IEEE INFOCOM 2020 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp 1141–1146
Yang T, Gao S, Sun Z, Wang Y, Shen Y, Li X (2019Dec) Diamond sketch: Accurate per-flow measurement for big streaming data. IEEE Trans Parallel Distrib Syst 30(12):2650–2662
Dimitropoulos X, Hurley P, Kind A (2008) Probabilistic Lossy Counting: An Efficient Algorithm for Finding Heavy Hitters. ACM SIGCOMM Comput Commun Rev 38(1):5
Zhang Y, Singh S, Sen S, Duffield N, Lund C (2004) Online Identification of Hierarchical Heavy Hitters: Algorithms, Evaluation, and Applications. In: Proceedings of ACM IMC, pp 101–114
Liu Z, Manousis A, Vorsanger G, Sekar V, Braverman V (2016) One Sketch to Rule Them All: Rethinking Network Flow Monitoring with UnivMon. In: Proceedings of ACM SIGCOMM, pp 101–114
Zhou Y, Zhang Y, Ma C, Chen S, Odegbile O O (2019) Generalized sketch families for network traffic measurement. Proc ACM Meas Anal Comput Syst 3:3
Cohen R, Nezri Y (2019) Cardinality estimation in a virtualized network device using online machine learning. IEEE/ACM Trans Netw 27(5):2098–2110
Kumar A, Xu J, Wang J (2006) Space-Code Bloom Filter for Efficient Per-Flow Traffic Measurement. IEEE J Sel Areas Commun 24(12):2327–2339
Hao F, Kodialam M, Lakshman T V (2004) ACCEL-RATE: A Faster Mechanism for Memory Efficient per-Flow Traffic Estimation. ACM SIGMETRICS Perform Eval Rev 32(1):155–166
Bhuyan M H, Bhattacharyya D K, Kalita J K (2014) Network Anomaly Detection: Methods, Systems and Tools. IEEE Commun Surv Tutorials 16(1):303–336
Sperotto A, Schaffrath G, Sadre R, Morariu C, Pras A, Stiller B (2010) An Overview of IP Flow-Based Intrusion Detection. IEEE Commun Surv Tutorials 12(3):343–356
Zhao Q, Xu J, Kumar A (2006) Detection of Super Sources and Destinations in High-Speed Networks: Algorithms, Analysis and Evaluation. IEEE J Sel Areas Commun 24 (10):1840– 1852
Xiao Q, Qiao Y, Zhen M, Chen S (2014) Estimating the Persistent Spreads in High-Speed Networks. In: Proceedings of IEEE ICNP, pp 131–142
Huang H, Sun Y, Chen S, Tang S, Han K, Yuan J, Yang W (2018) You Can Drop but You Can’t Hide: K-persistent Spread Estimation in High-speed Networks. In: Proceedings of IEEE INFOCOM, pp 1889–1897
Marold A, Lieven P, Scheuermann B (2011) Distributed Probabilistic Network Traffic Measurements. In: Proceedings of KiVS, vol 17. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, pp 133–144
Yoon M, Li T, Chen S, Peir J (2011) Fit a Compact Spread Estimator in Small High-Speed Memory. IEEE/ACM Trans Netw 19(5):1253–1264
Huang H, Sun Y-E, Ma C, Chen S, Zhou Y, Yang W, Tang S, Xu H, Qiao Y (2020) An efficient k-persistent spread estimator for traffic measurement in high-speed networks. IEEE/ACM Trans Networking
Xiao Q, Chen S, Chen M, Ling Y (2015) Hyper-Compact Virtual Estimators for Big Network Data Based on Register Sharing. In: Proceedings of ACM SIGMETRICS, pp 417–428
Zhou Y, Zhou Y, Chen M, Chen S (2017) Persistent Spread Measurement for Big Network Data Based on Register Intersection. Proc ACM Measur Anal Comput Syst 1(1):1–29
Cisco sampled netflow. http://www.cisco.com
Mai J, Chuah C-N, Sridharan A, Ye T, Zang H (2006) Is Sampled Data Sufficient for Anomaly Detection?. In: Proceedings of ACM IMC, pp 165–176
Estan C, Varghese G (2003) New Directions in Traffic Measurement and Accounting: Focusing on the Elephants, Ignoring the Mice. ACM Trans Comput Syst 21(3):270–313
Mo Z, Qiao Y, Chen S, Li T (2014) Highly compact virtual maximum likelihood sketches for counting big network data. In: Proceedings of Allerton, pp 1188–1195
Sun Y, Huang H, Ma C, Chen S, Du Y, Xiao Q (2020) Online Spread Estimation with Non-duplicate Samplingv
CAIDA The CAIDA UCSD Anonymized Internet Traces 2016. http://www.caida.org/data/passive/passive_2016_dataset.xml, Accessed July 28, 2019
Acknowledgments
The research of authors is supported by the National Natural Science Foundation of China (NSFC) under Grant No. 62072322, Grant No. 61873177, and Liaoning Provincial Natural Science Foundation of China under Grant No. 20180550014.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
He Huang and Yang Du made equal contributions to this work.
This article belongs to the Topical Collection: Special Issue on Privacy-Preserving Computing
Guest Editors: Kaiping Xue, Zhe Liu, Haojin Zhu, Miao Pan and David S.L. Wei
Rights and permissions
About this article
Cite this article
Bu, X., Sun, YE., Du, Y. et al. A novel spread estimation based abnormal flow detection in high-speed networks. Peer-to-Peer Netw. Appl. 14, 1401–1413 (2021). https://doi.org/10.1007/s12083-020-01036-8
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s12083-020-01036-8