Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1140277.1140295acmconferencesArticle/Chapter ViewAbstractPublication PagesmetricsConference Proceedingsconference-collections
Article

Data streaming algorithms for estimating entropy of network traffic

Published: 26 June 2006 Publication History

Abstract

Using entropy of traffic distributions has been shown to aid a wide variety of network monitoring applications such as anomaly detection, clustering to reveal interesting patterns, and traffic classification. However, realizing this potential benefit in practice requires accurate algorithms that can operate on high-speed links, with low CPU and memory requirements. In this paper, we investigate the problem of estimating the entropy in a streaming computation model. We give lower bounds for this problem, showing that neither approximation nor randomization alone will let us compute the entropy efficiently. We present two algorithms for randomly approximating the entropy in a time and space efficient manner, applicable for use on very high speed (greater than OC-48) links. The first algorithm for entropy estimation is inspired by the structural similarity with the seminal work of Alon et al. for estimating frequency moments, and we provide strong theoretical guarantees on the error and resource usage. Our second algorithm utilizes the observation that the performance of the streaming algorithm can be enhanced by separating the high-frequency items (or elephants) from the low-frequency items (or mice). We evaluate our algorithms on traffic traces from different deployment scenarios.

References

[1]
A. Chakrabarti, K. Do Ba, and S. Muthukrishnan. Estimating entropy and entropy norm on data streams. In Proceedings of the 23rd International Symposium on Theoretical Aspects of Computer Science (STACS), 2006.
[2]
N. Alon, Y. Matias, and M. Szegedy. The space complexity of approximating the frequency moments. In Proceedings of ACM Symposium on Theory of Computing (STOC), 1996.
[3]
P. Barford, J. Kline, D. Plonka, and A. Ron. A Signal Analysis of Network Traffic Anomalies. In Proceedings of ACM SIGCOMM Internet Measurement Workshop (IMW), 2002.
[4]
J. D. Brutlag. Aberrant behavior detection in time series for network monitoring. In Proceedings of USENIX Large Installation System Administration Conference (LISA), 2000.
[5]
N. Duffield, C. Lund, and M. Thorup. Estimating flow distributions from sampled flow statistics. In Proceedings of ACM SIGCOMM, 2003.
[6]
C. Estan and G. Varghese. New directions in traffic measurement and accounting. In Proceedings of ACM SIGCOMM, 2002.
[7]
L. Feinstein, D. Schnackenberg, R. Balupari, and D. Kindred. Statistical approaches to DDoS attack detection and response. In Proceedings of the DARPA Information Survivability Conference and Exposition, 2003.
[8]
S. Guha, A. McGregor, and S. Venkatasubramanian. Streaming and sublinear approximation of entropy and information distances. In Proceedings of ACM Symposium on Discrete Algorithms (SODA), 2006.
[9]
F. Hao, M. Kodialam, and T. V. Lakshman. ACCEL-RATE: a faster mechanism for memory efficient per-flow traffic estimation. In Proceedings of ACM SIGMETRICS, 2004.
[10]
N. Hohn and D. Veitch. Inverting sampled traffic. In Proceedings of ACM/USENIX Internet Measurement Conference (IMC), 2003.
[11]
B. Kalyanasundaram and G. Schnitger. The probabilistic communication complexity of set intersection. SIAM Journal on Discrete Mathematics, 5(4):545--557, 1992.
[12]
V. Karamcheti, D. Geiger, Z. Kedem, and S. Muthukrishnan. Detecting malicious network traffic using inverse distributions of packet contents. In Proceedings of ACM SIGCOMM Workshop on Mining Network Data (MineNet), 2005.
[13]
A. Kumar, M. Sung, J. Xu, and J. Wang. Data streaming algorithms for efficient and accurate estimation of flow distribution. In Proceedings of ACM SIGMETRICS/IFIP WG 7.3 Performance, 2004.
[14]
A. Kumar, M. Sung, J. Xu, and E. Zegura. A data streaming algorithm for estimating subpopulation flow size distribution. In Proceedings of ACM SIGMETRICS, 2005.
[15]
E. Kushilevitz and N. Nisan. Communication complexity. Cambridge University Press, New York, NY, USA, 1997.
[16]
A. Lakhina, M. Crovella, and C. Diot. Diagnosing network-wide traffic anomalies. In Proceedings of ACM SIGCOMM, 2004.
[17]
A. Lakhina, M. Crovella, and C. Diot. Mining anomalies using traffic feature distributions. In Proceedings of ACM SIGCOMM, 2005.
[18]
K. Levchenko, R. Paturi, and G. Varghese. On the difficulty of scalably detecting network attacks. In Proceedings of ACM Conference on Computer and Communications Security (CCS), 2004.
[19]
S. Muthukrishnan. Data streams: algorithms and applications. http://athos.rutgers.edu/~muthu/stream-1-1.ps.
[20]
Cisco Netflow. http://www.cisco.com/warp/public/732/Tech/nmp/netflow/index.shtml.
[21]
M. Roughan, A. Greenberg, C. Kalmanek, M. Rumsewicz, J. Yates, and Y. Zhang. Experience in measuring internet backbone traffic variability: Models, metrics, measurements and meaning. In Proceedings of International Teletraffic Congress (ITC), 2003.
[22]
S. Venkataraman, D. Song, P. B. Gibbons, and A. Blum. New Streaming Algorithms for Fast Detection of Superspreaders . In Proceedings of Network and Distributed System Security Symposium (NDSS), 2005.
[23]
A. Wagner and B. Plattner. Entropy Based Worm and Anomaly Detection in Fast IP Networks. In Proceedings of IEEE International Workshop on Enabling Technologies, Infrastructures for Collaborative Enterprises, 2005.
[24]
K. Xu, Z.-L. Zhang, and S. Bhattacharya. Profiling internet backbone traffic: Behavior models and applications. In Proceedings of ACM SIGCOMM, 2005.
[25]
Y. Zhang, Z. Ge, M. Roughan, and A. Greenberg. Network anomography. In Proceedings of ACM/USENIX Internet Measurement Conference (IMC), 2005.
[26]
Y. Zhang, S. Singh, S. Sen, N. Duffield, and C. Lund. Online identification of hierarchical heavy hitters: algorithms, evaluations, and applications. In Proceedings of ACM/USENIX Internet Measurement Conference (IMC), 2004.

Cited By

View all
  • (2024)A Framework for Anomaly Detection in Blockchain Networks With SketchesIEEE/ACM Transactions on Networking10.1109/TNET.2023.329825332:1(686-698)Online publication date: Feb-2024
  • (2024)Windower: Feature Extraction for Real-Time DDoS Detection Using Machine LearningNOMS 2024-2024 IEEE Network Operations and Management Symposium10.1109/NOMS59830.2024.10575699(1-10)Online publication date: 6-May-2024
  • (2023)CocoSketch: High-Performance Sketch-Based Measurement Over Arbitrary Partial Key QueryIEEE/ACM Transactions on Networking10.1109/TNET.2023.325722631:6(2653-2668)Online publication date: Dec-2023
  • Show More Cited By

Index Terms

  1. Data streaming algorithms for estimating entropy of network traffic

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGMETRICS '06/Performance '06: Proceedings of the joint international conference on Measurement and modeling of computer systems
    June 2006
    404 pages
    ISBN:1595933190
    DOI:10.1145/1140277
    • cover image ACM SIGMETRICS Performance Evaluation Review
      ACM SIGMETRICS Performance Evaluation Review  Volume 34, Issue 1
      Performance evaluation review
      June 2006
      388 pages
      ISSN:0163-5999
      DOI:10.1145/1140103
      Issue’s Table of Contents
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 26 June 2006

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. data streaming
    2. traffic analysis

    Qualifiers

    • Article

    Conference

    SIGMETRICS06

    Acceptance Rates

    Overall Acceptance Rate 459 of 2,691 submissions, 17%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)45
    • Downloads (Last 6 weeks)5
    Reflects downloads up to 12 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)A Framework for Anomaly Detection in Blockchain Networks With SketchesIEEE/ACM Transactions on Networking10.1109/TNET.2023.329825332:1(686-698)Online publication date: Feb-2024
    • (2024)Windower: Feature Extraction for Real-Time DDoS Detection Using Machine LearningNOMS 2024-2024 IEEE Network Operations and Management Symposium10.1109/NOMS59830.2024.10575699(1-10)Online publication date: 6-May-2024
    • (2023)CocoSketch: High-Performance Sketch-Based Measurement Over Arbitrary Partial Key QueryIEEE/ACM Transactions on Networking10.1109/TNET.2023.325722631:6(2653-2668)Online publication date: Dec-2023
    • (2023)A One-Pass Clustering Based Sketch Method for Network MonitoringIEEE/ACM Transactions on Networking10.1109/TNET.2023.325198131:6(2604-2613)Online publication date: Dec-2023
    • (2023)Universal and Accurate Sketch for Estimating Heavy Hitters and Moments in Data StreamsIEEE/ACM Transactions on Networking10.1109/TNET.2022.321602531:5(1919-1934)Online publication date: Oct-2023
    • (2022)Sketch-based entropy estimationProceedings of the 5th International Workshop on P4 in Europe10.1145/3565475.3569082(57-60)Online publication date: 9-Dec-2022
    • (2022)Tabular Interpolation Approach Based on Stable Random Projection for Estimating Empirical Entropy of High-Speed Network TrafficIEEE Access10.1109/ACCESS.2022.321033610(104934-104953)Online publication date: 2022
    • (2021)CocoSketchProceedings of the 2021 ACM SIGCOMM 2021 Conference10.1145/3452296.3472892(207-222)Online publication date: 9-Aug-2021
    • (2021)Efficient Forwarding Anomaly Detection in Software-Defined NetworksIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2021.306813532:11(2676-2690)Online publication date: 1-Nov-2021
    • (2021)Jellyfish: Locality-Sensitive Subflow SketchingIEEE INFOCOM 2021 - IEEE Conference on Computer Communications10.1109/INFOCOM42981.2021.9488847(1-10)Online publication date: 10-May-2021
    • Show More Cited By

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media