DOI: 10.1145/1644893.1644918
research-article

The nature of data center traffic: measurements & analysis

Published: 04 November 2009

Abstract

We explore the nature of traffic in data centers designed to support the mining of massive data sets. We instrument the servers to collect socket-level logs, with negligible performance impact. In a 1,500-server operational cluster, we thus amass roughly a petabyte of measurements over two months, from which we obtain and report detailed views of traffic and congestion conditions and patterns. We further consider whether traffic matrices in the cluster might instead be obtained via tomographic inference from coarser-grained counter data.
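To make the last point of the abstract concrete, the sketch below shows one simple way a traffic matrix can be guessed from coarse counters alone: a gravity model, which is a common starting point in the traffic-matrix estimation literature. It is an illustration under stated assumptions, not the procedure evaluated in the paper. The function name, the toy counter values, and the choice of per-node send and receive totals as inputs are all hypothetical.

```python
def gravity_estimate(out_bytes, in_bytes):
    """Estimate a traffic matrix from per-node byte totals (gravity model).

    out_bytes[i] is the total volume sent by node i and in_bytes[j] is the
    total volume received by node j, as coarse counters might report them.
    The estimate assumes traffic from i to j is proportional to the product
    of the two totals: est[i][j] = out_bytes[i] * in_bytes[j] / total.
    This is an assumption for illustration; real traffic may be far from it.
    """
    total = sum(out_bytes)
    if total == 0:
        # No traffic observed: return an all-zero matrix.
        return [[0.0] * len(in_bytes) for _ in out_bytes]
    return [[o * r / total for r in in_bytes] for o in out_bytes]


# Toy counters for three nodes (made-up values; sent and received totals match).
out_bytes = [60, 30, 10]
in_bytes = [50, 25, 25]
est = gravity_estimate(out_bytes, in_bytes)

# Each row sums to that node's sent total, each column to its received total.
row_sums = [sum(row) for row in est]
col_sums = [sum(est[i][j] for i in range(len(est))) for j in range(len(in_bytes))]
```

The estimate reproduces the counters it was built from, but it says nothing about how well the individual entries match the true matrix. Answering that requires fine-grained ground truth, such as the socket-level logs described above.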

Published In

IMC '09: Proceedings of the 9th ACM SIGCOMM conference on Internet measurement
November 2009
468 pages
ISBN: 9781605587714
DOI: 10.1145/1644893

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. characterization
  2. data center traffic
  3. models
  4. tomography

Qualifiers

  • Research-article

Conference

IMC '09
IMC '09: Internet Measurement Conference
November 4 - 6, 2009
Chicago, Illinois, USA

Acceptance Rates

Overall Acceptance Rate 277 of 1,083 submissions, 26%
