DOI: 10.1145/2907294.2907319
research-article

IBIS: Interposed Big-data I/O Scheduler

Published: 31 May 2016

Abstract

Big-data systems are increasingly shared by diverse, data-intensive applications from different domains. However, existing systems lack support for I/O management, and the performance of big-data applications degrades unpredictably when they contend for I/O. To address this challenge, this paper proposes IBIS, an Interposed Big-data I/O Scheduler, which provides I/O performance differentiation for competing applications in a shared big-data system. IBIS transparently intercepts, isolates, and schedules the different phases of an application's I/O via an I/O interposition layer on every datanode of the big-data system. It provides a new proportional-share I/O scheduler, SFQ(D2), which allows applications to share the I/O service of each datanode with good fairness and resource utilization. It enables the distributed I/O schedulers to coordinate with one another and achieve proportional sharing of the big-data system's total I/O service in a scalable manner. Finally, it supports the shared use of big-data resources by diverse frameworks and manages the I/O from different types of big-data workloads (e.g., batch jobs vs. queries) across these frameworks. The IBIS prototype is implemented in Hadoop/YARN, a widely used big-data system. Experiments based on a variety of representative applications (WordCount, TeraSort, Facebook, TPC-H) show that IBIS achieves good total-service proportional sharing with low overhead in both application performance and resource usage. IBIS is also shown to support various performance policies: it can deliver stronger performance isolation than native Hadoop/YARN (99% better for WordCount and 15% better for TPC-H queries) with good resource utilization, and it can achieve perfect proportional slowdown with better application performance (30% better than native Hadoop).
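
The abstract does not spell out how SFQ(D2) works, so the following is only a generic illustration of proportional-share I/O scheduling on a single node. It sketches the well-known depth-controlled start-time fair queueing scheme, SFQ(D), not the IBIS scheduler itself. The class name `SFQD`, the application names, the weights, and the depth value are all assumptions made for this example.

```python
import heapq
from collections import Counter
from dataclasses import dataclass, field


@dataclass(order=True)
class Request:
    start_tag: float                      # ordering key: smaller tag is served first
    seq: int                              # tie-breaker: arrival order
    app: str = field(compare=False)
    cost: float = field(compare=False)


class SFQD:
    """Start-time fair queueing with a bounded dispatch depth (illustrative only)."""

    def __init__(self, weights, depth):
        self.weights = dict(weights)      # app -> share weight (hypothetical values)
        self.depth = depth                # max requests outstanding at the device
        self.virtual_time = 0.0           # start tag of the last dispatched request
        self.last_finish = {app: 0.0 for app in self.weights}
        self.queue = []                   # min-heap ordered by (start_tag, seq)
        self.outstanding = 0
        self.seq = 0

    def submit(self, app, cost):
        # A request starts no earlier than the previous request of the same app finished.
        start = max(self.virtual_time, self.last_finish[app])
        # Higher weight means the finish tag advances more slowly, so more service.
        self.last_finish[app] = start + cost / self.weights[app]
        heapq.heappush(self.queue, Request(start, self.seq, app, cost))
        self.seq += 1

    def dispatch(self):
        """Issue queued requests in start-tag order, keeping at most `depth` in flight."""
        issued = []
        while self.queue and self.outstanding < self.depth:
            req = heapq.heappop(self.queue)
            self.virtual_time = req.start_tag
            self.outstanding += 1
            issued.append(req)
        return issued

    def complete(self):
        """Call when the device finishes one request; returns newly issued requests."""
        self.outstanding -= 1
        return self.dispatch()


if __name__ == "__main__":
    sched = SFQD({"A": 2, "B": 1}, depth=4)
    for _ in range(30):                   # both apps stay backlogged
        sched.submit("A", 1.0)
        sched.submit("B", 1.0)
    order = sched.dispatch()
    while len(order) < 30:
        order += sched.complete()
    print(Counter(r.app for r in order))  # A receives twice the service of B
```

With weights 2:1 and both applications backlogged, the first 30 dispatched requests split 20 to `A` and 10 to `B`. The depth bound trades fairness against device utilization: a larger depth keeps the device busier but lets more requests escape the scheduler's ordering.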

Published In

HPDC '16: Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing
May 2016
302 pages
ISBN: 9781450343145
DOI: 10.1145/2907294

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. big data
  2. i/o scheduler
  3. storage management

Qualifiers

  • Research-article

Funding Sources

  • Department of Defense award
  • National Science Foundation CAREER award

Conference

HPDC'16

Acceptance Rates

HPDC '16 Paper Acceptance Rate: 20 of 129 submissions, 16%
Overall Acceptance Rate: 166 of 966 submissions, 17%

Cited By

  • (2021) Achieving Fairness-Aware Two-Level Scheduling for Heterogeneous Distributed Systems. IEEE Transactions on Services Computing, 14:3 (639-653). DOI: 10.1109/TSC.2018.2836444. Online publication date: 1-May-2021
  • (2021) Enhancing Proportional IO Sharing on Containerized Big Data File Systems. IEEE Transactions on Computers (1-1). DOI: 10.1109/TC.2020.3037078. Online publication date: 2021
  • (2020) Provisioning Input and Output Data Rates in Data Processing Frameworks. Journal of Grid Computing. DOI: 10.1007/s10723-020-09508-0. Online publication date: 3-Mar-2020
  • (2019) Emulation of Storage Performance in Testbed Experiments with Distem. IEEE INFOCOM 2019 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS) (805-810). DOI: 10.1109/INFCOMW.2019.8845155. Online publication date: Apr-2019
  • (2018) Principled schedulability analysis for distributed storage systems using thread architecture models. Proceedings of the 13th USENIX conference on Operating Systems Design and Implementation (161-176). DOI: 10.5555/3291168.3291181. Online publication date: 8-Oct-2018
  • (2018) CLIBE: Precise Cluster-Level I/O Bandwidth Enforcement in Distributed File System. 2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS) (124-131). DOI: 10.1109/HPCC/SmartCity/DSS.2018.00048. Online publication date: Jun-2018
