End-to-End Data-Flow Parallelism for Throughput Optimization in High-Speed Networks

Esma Yildirim¹ &
Tevfik Kosar²

Abstract

The increase in the data produced by large-scale scientific applications necessitates innovative solutions for efficient data transfer. Although current optical networking technology has reached theoretical speeds of 100 Gbps, applications still suffer from inefficient transport protocols and bottlenecks on the end systems (e.g., disk, CPU, NIC). High-performance systems provide parallel disks, processors and network interfaces. However, the lack of orchestration of these end-system resources with the available network capacity results in underutilization of the network bandwidth. In this study, we propose a model and two algorithms that use ‘end-to-end data-flow parallelism’ to optimize the use of network and end-system resources. This is achieved by using multiple parallel streams over the network and multiple parallel disks and CPUs at the end systems. Our model predicts the optimal number of streams and disk/CPU stripes that maximizes the data transfer speed for any setting. Our algorithms use GridFTP parallel samplings and calculate the optimal level of parallelism based on our prediction model. Experiments using actual GridFTP transfers show that the predictions made by our model and algorithms provide close-to-optimal performance with negligible overhead and use a minimal number of resources. In the presence of end-system bottlenecks, end-to-end data transfer throughput improves dramatically compared to non-optimized transfers.
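
The sampling-then-prediction idea described above can be illustrated with a minimal sketch. The abstract does not state the form of the prediction model, so the curve used here, Th(n) = n / sqrt(a·n² + b·n + c), is an assumption made purely for illustration, as are all function names, the three-sample fit, and the `max_streams` cap. This is not the authors' implementation and makes no GridFTP calls; the throughput samples would come from short sampling transfers at three different stream counts.

```python
import math


def predicted_throughput(n, a, b, c):
    """Assumed illustrative model: Th(n) = n / sqrt(a*n^2 + b*n + c)."""
    return n / math.sqrt(a * n * n + b * n + c)


def _det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_coefficients(samples):
    """Fit (a, b, c) from exactly three (stream_count, throughput) samples.

    Rearranging the assumed model gives n^2 / Th^2 = a*n^2 + b*n + c,
    which is linear in (a, b, c); Cramer's rule solves the 3x3 system.
    """
    if len(samples) != 3:
        raise ValueError("exactly three samples are required")
    rows = [[float(n * n), float(n), 1.0] for n, _ in samples]
    rhs = [(n / th) ** 2 for n, th in samples]
    det = _det3(rows)
    if abs(det) < 1e-12:
        raise ValueError("sampled stream counts must be distinct")
    coeffs = []
    for col in range(3):
        m = [row[:] for row in rows]
        for i in range(3):
            m[i][col] = rhs[i]  # replace one column with the right-hand side
        coeffs.append(_det3(m) / det)
    return tuple(coeffs)


def optimal_streams(a, b, c, max_streams=64):
    """Return the smallest stream count in 1..max_streams with the highest
    predicted throughput; counts where the model is undefined are skipped."""
    valid = [n for n in range(1, max_streams + 1) if a * n * n + b * n + c > 0]
    if not valid:
        raise ValueError("model is undefined for every candidate stream count")
    # max() keeps the first maximal element, so ties favor fewer streams.
    return max(valid, key=lambda n: predicted_throughput(n, a, b, c))
```

Under this assumed curve the continuous optimum lies at n = -2c/b when b < 0 and c > 0; the sketch searches integer stream counts directly instead, which also covers the case where the fitted curve keeps increasing up to the cap.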

Author information

Authors and Affiliations

State University of New York, 338H Davis Hall, Buffalo, NY, USA
Esma Yildirim
State University of New York, 338J Bell Hall, Buffalo, NY, USA
Tevfik Kosar

Corresponding author

Correspondence to Esma Yildirim.

About this article

Cite this article

Yildirim, E., Kosar, T. End-to-End Data-Flow Parallelism for Throughput Optimization in High-Speed Networks. J Grid Computing 10, 395–418 (2012). https://doi.org/10.1007/s10723-012-9220-9

Received: 30 September 2011
Accepted: 05 July 2012
Published: 10 August 2012
Issue Date: September 2012
DOI: https://doi.org/10.1007/s10723-012-9220-9
