DOI: 10.1145/1555349.1555380
research-article

Maximum likelihood estimation of the flow size distribution tail index from sampled packet data

Published: 15 June 2009

Abstract

In the context of network traffic analysis, we address the problem of estimating the tail index of the flow size distribution (or, more generally, of any group size distribution) from the observation of a sampled population of packets (individuals). We give an exhaustive bibliography of the existing methods and show how they relate to one another. The main contribution of this work is a new method for estimating the tail index from sampled data, based on solving the maximum likelihood problem. We assess the performance of the method through numerical simulations and on a recently acquired real Internet traffic trace.
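To make the setting concrete, the following is a minimal sketch of the estimation problem, not the maximum likelihood method proposed in the paper. It assumes flow sizes with a Pareto-type tail, independent packet sampling with probability p (binomial thinning), and uses the classical Hill estimator as a stand-in tail index estimator. All function names, the truncation parameter `max_size`, and the parameter values are hypothetical choices made for illustration.

```python
import math
import random


def draw_flow_sizes(n_flows, alpha, rng, max_size=10**6):
    """Draw integer flow sizes (in packets) with P(X > x) ~ x^(-alpha).

    Assumption of this sketch: a continuous Pareto with minimum 1 is drawn by
    inverse-transform sampling, rounded down, and capped at max_size to bound
    the run time of the thinning loop.
    """
    sizes = []
    for _ in range(n_flows):
        u = 1.0 - rng.random()                  # u lies in (0, 1]
        size = math.floor(u ** (-1.0 / alpha))  # Pareto(alpha) value >= 1
        sizes.append(min(max(1, size), max_size))
    return sizes


def thin_flows(sizes, p, rng):
    """Keep each packet independently with probability p.

    Flows with no sampled packet are unobserved, so they are dropped.
    """
    sampled = []
    for size in sizes:
        kept = sum(1 for _ in range(size) if rng.random() < p)
        if kept > 0:
            sampled.append(kept)
    return sampled


def hill_estimator(values, k):
    """Hill estimate of the tail index from the k largest observations."""
    if not 0 < k < len(values):
        raise ValueError("k must satisfy 0 < k < len(values)")
    ordered = sorted(values, reverse=True)
    threshold = ordered[k]                      # (k+1)-th largest value
    log_sum = sum(math.log(x / threshold) for x in ordered[:k])
    if log_sum <= 0.0:
        raise ValueError("the k largest values are all tied; choose another k")
    return k / log_sum


if __name__ == "__main__":
    rng = random.Random(0)
    original = draw_flow_sizes(50_000, alpha=1.5, rng=rng)
    sampled = thin_flows(original, p=0.1, rng=rng)
    print("flows observed after sampling:", len(sampled), "of", len(original))
    print("Hill estimate, original sizes:", hill_estimator(original, k=500))
    print("Hill estimate, sampled sizes: ", hill_estimator(sampled, k=200))
```

Comparing the two printed estimates gives a rough sense of how packet sampling distorts the information available to an estimator that ignores the sampling step, which is the gap a sampling-aware likelihood is meant to close. The choice of k is left to the user here and strongly affects the Hill estimate.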

      Published In

      SIGMETRICS '09: Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems
      June 2009
      336 pages
      ISBN: 9781605585116
      DOI: 10.1145/1555349
      • ACM SIGMETRICS Performance Evaluation Review, Volume 37, Issue 1
        SIGMETRICS '09
        June 2009
        320 pages
        ISSN: 0163-5999
        DOI: 10.1145/2492101
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. expectation-maximization algorithm
      2. heavy-tailed distribution
      3. maximum likelihood estimation
      4. network monitoring
      5. packet sampling
      6. traffic measurement

      Qualifiers

      • Research-article

      Conference

      SIGMETRICS09

      Acceptance Rates

      Overall Acceptance Rate 459 of 2,691 submissions, 17%
