Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Rethinking FTP: Aggressive block reordering for large file transfers

Published: 09 February 2009 Publication History

Abstract

Whole-file transfer is a basic primitive for Internet content dissemination. Content servers are increasingly limited by disk arm movement, given the rapid growth in disk density, disk transfer rates, server network bandwidth, and content size. Individual file transfers are sequential, but the block access sequence on a content server is effectively random when many slow clients access large files concurrently. Although larger blocks can help improve disk throughput, buffering requirements increase linearly with block size.
This article explores a novel block reordering technique that can reduce server disk traffic significantly when large content files are shared. The idea is to transfer blocks to each client in any order that is convenient for the server. The server sends blocks to each client opportunistically in order to maximize the advantage from the disk reads it issues to serve other clients accessing the same file. We first illustrate the motivation and potential impact of aggressive block reordering using simple analytical models. Then we describe a file transfer system using a simple block reordering algorithm, called Circus. Experimental results with the Circus prototype show that it can improve server throughput by a factor of two or more in workloads with strong file access locality.

References

[1]
Acharya, S., Franklin, M., and Zdonik, S. 1997. Balancing push and pull for data broadcast. In Proceedings of the ACM SIGMOD, 183--194.
[2]
Allcock, B., Bester, J., Bresnahan, J., Chervenak, A. L., Foster, I., Kesselman, C., Meder, S., Nefedova, V., Quesnal, D., and Tuecke, S. 2002. Data management and transfer in high performance computational grid environments. Parallel Comput. J. 28, 5, 749--771.
[3]
Almeida, J. M., Krueger, J., Eager, D. L., and Vernon, M. K. 2001. Analysis of educational media server workloads. In Proceedings of the International Workshop on Network and Operating System Support for Digital Audio and Video, 21--30.
[4]
Anastasiadis, S. V., Sevcik, K. C., and Stumm, M. 2001. Modular and efficient resource management in the exedra media server. In Proceedings of the USENIX Symposium on Internet Technologies and Systems, 25--36.
[5]
Anastasiadis, S. V., Wickremesinghe, R. G., and Chase, J. S. 2004. Circus: Opportunistic block reordering for scalable content servers. In Proceedings of the USENIX Conference on File and Storage Technologies, 201--212.
[6]
Arlitt, M. F. and Williamson, C. L. 1996. Web server workload characterization: The search for invariants. In Proceedings of the ACM SIGMETRICS, 126--137.
[7]
Baker, M. G., Hartman, J. H., Kupfer, M. D., Shirriff, K. W., and Ousterhout, J. K. 1991. Measurements of a distributed file system. In Proceedings of the ACM Symposium on Operating Systems Principles, 198--212.
[8]
Barford, P. and Crovella, M. 1998. Generating representative Web workloads for network and server performance evaluation. In Proceedings of the ACM SIGMETRICS, 151--160.
[9]
Brown, A. D., Mowry, T. C., and Krieger, O. 2001. Compiler-Based I/O prefetching for out-of-core applications. ACM Trans. Comput. Syst. 19, 2, 111--170.
[10]
Byers, J., Considine, J., Mitzenmacher, M., and Rost, S. 2002. Informed content delivery across adaptive overlay networks. In Proceedings of the ACM SIGCOMM, 47--60.
[11]
Byers, J. W., Luby, M., Mitzenmacher, M., and Rege, A. 1998. A digital fountain approach to reliable distribution of bulk data. In Proceedings of the ACM SIGCOMM, 57--67.
[12]
Cao, P., Felten, E. W., Karlin, A., and Li, K. 1995. A study of integrated prefetching and caching strategies. In Proceedings of the SIGMETRICS/Peformance'95.
[13]
Chesire, M., Wolman, A., Voelker, G. M., and Levy, H. M. 2001. Measurement and analysis of a streaming-media workload. In Proceedings of the USENIX Symposium on Internet Technologies and Systems, 1--12.
[14]
Clark, D. D. and Tennenhouse, D. L. 1990. Architectural considerations for a new generation of protocols. In Proceedings of the ACM SIGCOMM, 200--208.
[15]
Coffman, K. and Odlyzko, A. M. 2002. Internet growth: Is there a “moore's law” for data traffic? In Proceedings of the Handbook of Massive Data Sets. Kluwer Academic, 47--93.
[16]
Cohen, B. 2003. Incentives build robustness in bittorrent. bitconjurer.org.
[17]
Diot, C. and Gagnon, F. 1999. Impact of out-of-sequence processing on the performance of data transmission. Comput. Netw. 31, 475--492.
[18]
Doyle, R. P., Chase, J. S., Gadde, S., and Vahdat, A. M. 2001. The trickle-down effect: Web caching and server request distribut ion. In Proceedings of the International Workshop on Web Caching and Content Delivery.
[19]
Eager, D., Vernon, M., and Zahorjan, J. 2001. Minimizing bandwidth requirements for on-demand data delivery. IEEE Trans. Knowl. Data Eng. 13, 5, 742--757.
[20]
Garey, M. R. and Johnson, D. S. 1979. Computers and Intractability. Freeman, New York.
[21]
Jin, S. and Bestavros, A. 2002. Scalability of multicast delivery for non-sequential streaming access. In Proceedings of the ACM SIGMETRICS, 97--107.
[22]
Luby, M. 2002. Lt codes. In Proceedings of the IEEE Symposium on Foundations of Computer Science, 271--282.
[23]
Megiddo, N. and Modha, D. S. 2003. Arc: A self-tuning, low overhead replacement cache. In Proceedings of the 2nd USENIX Conference on File and Storage Technologies (FAST'03).
[24]
Padhye, J., Firoiu, V., Towsley, D. F., and Kurose, J. F. 2000. Modeling TCP Reno performance: A simple model and its empirical validation. IEEE/ACM Trans. Netw. 8, 2, 133--145.
[25]
Pai, V. S., Aron, M., Banga, G., Svendsen, M., Druschel, P., Zwaenepoel, W., and Nahum, E. 1998. Locality-Aware request distribution in cluster-based network servers. In Proceedings of the ACM ASPLOS, 205--216.
[26]
Park, K. and Pai, V. S. 2006. Scale and performance in the Coblitz large-file distribution service. In Proceedings of the USENIX Symposium on Networked Systems Design & Implementation, 29--44.
[27]
Patterson, R. H., Gibson, G. A., Ginting, E., Stodolsky, D., and Zelenka, J. 1995. Informed prefetching and caching. In Proceedings of the ACM Symposium on Operating Systems Principles, 79--95.
[28]
Postel, J. and Reynolds, J. 1985. File transfer protocol (ftp). USC/ISI, Network Working Group RFC 959.
[29]
Raman, S., Balakrishnan, H., and Srinivasan, M. 2000. An image transport protocol for the internet. In Proceedings of the International Conference on Network Protocols, 209--219.
[30]
Rizzo, L. 1997. Dummynet: A simple approach to the evaluation of network protocol. ACM Commun. Rev. 47, 1, 31--41.
[31]
Rost, S., Byers, J., and Bestavros, A. 2001. The cyclone server architecture: Streamlining delivery of popular content. In Proceedings of the International Workshop on Web Caching and Content Distribution. Boston, MA.
[32]
Saroiu, S., Gummadi, P. K., Dunn, R. J., Gribble, S. D., and Levy, H. M. 2002. An analysis of internet content delivery systems. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation, 315--328.
[33]
Saroiu, S., Gummadi, P. K., and Gribble, S. D. 2002. A measurement study of peer-to-peer file sharing systems. In Proceedings of the SPIE/ACM Multimedia Computing and Networking Conference.
[34]
Steere, D. C. 1997. Exploiting the non-determinism and asynchrony of set iterators to reduce aggregate file I/O latency. In Proceedings of the ACM Symposium on Operating Systems Principles, 252--263.
[35]
Trivedi, K. S. 1982. Probability and Statistics with Reliability, Queuing and Computer Science Applications. Prentice-Hall, Englewood Cliffs, NJ.
[36]
Vitter, J. S. and Krishnan, P. 1996. Optimal prefetching via data compression. J. ACM 43, 5, 771--793.
[37]
Vogels, W. 1999. File system usage in windows nt 4.0. In Proceedings of the ACM Symposium on Operating Systems Principles, 93--109.
[38]
Wang, L., Pai, V. S., and Peterson, L. L. 2002. The effectiveness of request redirection on CDN robustness. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation, 345--360.
[39]
Zhang, Y., Breslau, L., Paxson, V., and Shenker, S. 2002. On the characteristics and origins of internet flow rates. In Proceedings of the ACM SIGCOMM.

Cited By

View all

Index Terms

  1. Rethinking FTP: Aggressive block reordering for large file transfers

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image ACM Transactions on Storage
      ACM Transactions on Storage  Volume 4, Issue 4
      January 2009
      116 pages
      ISSN:1553-3077
      EISSN:1553-3093
      DOI:10.1145/1480439
      Issue’s Table of Contents
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 09 February 2009
      Accepted: 01 April 2008
      Revised: 01 April 2008
      Received: 01 December 2007
      Published in TOS Volume 4, Issue 4

      Permissions

      Request permissions for this article.

      Check for updates

      Author Tags

      1. Disk access
      2. file transfer protocols
      3. scheduling

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)8
      • Downloads (Last 6 weeks)0
      Reflects downloads up to 18 Feb 2025

      Other Metrics

      Citations

      Cited By

      View all
      • (2019)Preventing Silent Data Corruptions from Propagating During Data ReconstructionIEEE Transactions on Computers10.1109/TC.2010.3659:12(1611-1624)Online publication date: 4-Jan-2019
      • (2019)DACOIEEE Transactions on Computers10.1109/TC.2010.2259:10(1350-1362)Online publication date: 3-Jan-2019
      • (2014)File system and storage array design challenges for flash memory2014 International Conference on Green Computing Communication and Electrical Engineering (ICGCCEE)10.1109/ICGCCEE.2014.6922453(1-8)Online publication date: Mar-2014
      • (2014)CloudJet4BigDataProceedings of the 2014 IEEE International Congress on Big Data10.1109/BigData.Congress.2014.93(608-615)Online publication date: 27-Jun-2014
      • (2013)A Flexible Framework to Enhance RAID-6 Scalability via Exploiting the Similarities among MDS CodesProceedings of the 2013 42nd International Conference on Parallel Processing10.1109/ICPP.2013.68(542-551)Online publication date: 1-Oct-2013
      • (2013)SupersetProceedings of the 2013 International Conference on Advanced Cloud and Big Data10.1109/CBD.2013.34(139-146)Online publication date: 13-Dec-2013
      • (2013)On endurance of erasure codes in SSD-based storage systemsThe 17th CSI International Symposium on Computer Architecture & Digital Systems (CADS 2013)10.1109/CADS.2013.6714239(67-72)Online publication date: Oct-2013
      • (2013)CORE: Cross-object redundancy for efficient data repair in storage systems2013 IEEE International Conference on Big Data10.1109/BigData.2013.6691581(246-254)Online publication date: Oct-2013
      • (2012)Improving write performance by enhancing internal parallelism of Solid State Drives2012 IEEE 31st International Performance Computing and Communications Conference (IPCCC)10.1109/PCCC.2012.6407767(266-274)Online publication date: Dec-2012
      • (2012)Shifted Element Arrangement in Mirror Disk Arrays for High Data Availability during ReconstructionProceedings of the 2012 41st International Conference on Parallel Processing10.1109/ICPP.2012.53(178-188)Online publication date: 10-Sep-2012
      • Show More Cited By

      View Options

      Login options

      Full Access

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Figures

      Tables

      Media

      Share

      Share

      Share this Publication link

      Share on social media