DOI: 10.1145/2903150.2911718
research-article

Large transfers for data analytics on shared wide-area networks

Published: 16 May 2016

Abstract

One part of large-scale data analytics is the problem of transferring the data across wide-area networks (WANs). Often, the data must be gathered (e.g., from remote sites), processed, possibly transferred (e.g., for further processing), and then possibly disseminated. If the data-transfer stages are bottlenecks, the overall data analytics pipeline will be affected.
Although a variety of tools and protocols have been developed for large data transfers on WANs, most of the related work has been in the context of dedicated or non-shared networks. However, in practice, most networks are likely to be shared.
We consider and evaluate large data transfers on shared networks with large round-trip times (RTTs), as are found on many WANs. Using a variety of synthetic background network traffic (e.g., uniform, TCP, UDP, square waveform, bursty), we compare the performance of well-known protocols (e.g., GridFTP, UDT). On our emulated WAN, both GridFTP and UDT perform well in all-TCP situations, but UDT performs better when UDP-based background traffic is prominent.
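
The synthetic background patterns named above can be pictured as per-second target sending rates. The following is a minimal Python sketch under that assumption; the function names, the one-second granularity, and the example rates are hypothetical and are not taken from the paper's traffic generator.

```python
import random


def uniform_profile(rate_mbps, duration_s):
    """Constant-rate background traffic: the same target rate every second."""
    return [rate_mbps] * duration_s


def square_wave_profile(high_mbps, low_mbps, period_s, duration_s):
    """Alternate between a high and a low rate.

    Each half-period lasts period_s // 2 seconds (period_s is assumed
    to be an even number of seconds, at least 2).
    """
    if period_s < 2 or period_s % 2 != 0:
        raise ValueError("period_s must be an even integer >= 2")
    half = period_s // 2
    return [
        high_mbps if (t // half) % 2 == 0 else low_mbps
        for t in range(duration_s)
    ]


def bursty_profile(burst_mbps, burst_prob, duration_s, seed=0):
    """Each second is a burst with probability burst_prob, otherwise idle.

    This is only one simple way to model burstiness (hypothetical).
    """
    rng = random.Random(seed)  # seeded so a run can be repeated
    return [
        burst_mbps if rng.random() < burst_prob else 0.0
        for _ in range(duration_s)
    ]


def mean_rate(profile):
    """Average offered load of a profile, in the same unit as its entries."""
    return sum(profile) / len(profile)


if __name__ == "__main__":
    # Hypothetical example: 800 Mb/s on for 2 s, then off for 2 s.
    wave = square_wave_profile(800, 0, period_s=4, duration_s=8)
    print(wave, mean_rate(wave))
```

A profile like this could drive any rate-limited traffic source, with TCP or UDP as the carrying transport. Two profiles with the same mean rate (for example, a uniform one and a square wave) can still stress a foreground transfer differently, which is one reason to test several shapes.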



Published In

CF '16: Proceedings of the ACM International Conference on Computing Frontiers
May 2016
487 pages
ISBN:9781450341288
DOI:10.1145/2903150
General Chairs: Gianluca Palermo, John Feo
Program Chairs: Antonino Tumeo, Hubertus Franke
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data transfer
  2. fairness
  3. high-performance network
  4. shared network
  5. wide-area networks

Qualifiers

  • Research-article

Conference

CF'16
CF'16: Computing Frontiers Conference
May 16 - 19, 2016
Como, Italy

Acceptance Rates

Overall Acceptance Rate 24 of 66 submissions, 36%

Article Metrics

  • Downloads (Last 12 months): 5
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 16 Nov 2024

Cited By

  • (2021) Machine-Learned Recognition of Network Traffic for Optimization through Protocol Selection. Computers 10:6 (76). DOI: 10.3390/computers10060076. Online publication date: 11-Jun-2021
  • (2021) Active Probing for Improved Machine-Learned Recognition of Network Traffic. Machine Learning for Networking (122-140). DOI: 10.1007/978-3-030-70866-5_8. Online publication date: 3-Mar-2021
  • (2020) Learning Mixed Traffic Signatures in Shared Networks. Computational Science – ICCS 2020 (524-537). DOI: 10.1007/978-3-030-50371-0_39. Online publication date: 15-Jun-2020
  • (2018) A Survey of End-System Optimizations for High-Speed Networks. ACM Computing Surveys 51:3 (1-36). DOI: 10.1145/3184899. Online publication date: 16-Jul-2018
  • (2018) Machine-Learned Classifiers for Protocol Selection on a Shared Network. Machine Learning for Networking (98-116). DOI: 10.1007/978-3-030-19945-6_7. Online publication date: 27-Nov-2018
  • (2017) A Large Scale Data Transmission Control Mechanism across Data Centers. 2017 Fifth International Conference on Advanced Cloud and Big Data (CBD) (20-25). DOI: 10.1109/CBD.2017.12. Online publication date: Aug-2017
