DOI: 10.1145/3225058.3225099
research-article

Joint Optimization of MapReduce Scheduling and Network Policy in Hierarchical Clouds

Published: 13 August 2018

Abstract

As MapReduce grows increasingly popular for large-scale data analysis, there is a growing need to move MapReduce into multi-tenant clouds. An important challenge, however, is that the performance of MapReduce applications can be significantly affected by the time-varying network bandwidth in a shared cluster. Although a few recent studies improve MapReduce performance through dynamic scheduling that reduces shuffle traffic, most of them do not consider the impact of the hierarchical network architectures that are widespread in data centers. In this paper, we propose and design a Hierarchical topology (Hit) aware MapReduce scheduler that minimizes overall data traffic cost and thereby reduces job execution time. We first formulate the problem as a Topology Aware Assignment (TAA) optimization problem that accounts for dynamic computing and communication resources in a cloud with a hierarchical network architecture. We then develop a synergistic strategy that solves the TAA problem using stable matching theory, which respects the preferences of both individual tasks and hosting machines. Finally, we implement the proposed scheduler as a pluggable module on Hadoop YARN and evaluate its performance through testbed experiments and simulations. The experimental results show that Hit-scheduler improves job completion time by 28% and 11% compared to Capacity Scheduler and the Probabilistic Network-Aware scheduler, respectively. Our simulations further demonstrate that Hit-scheduler reduces the traffic cost by up to 38% and improves the average shuffle flow traffic time by 32% compared to Capacity Scheduler.
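
The abstract names stable matching theory as the tool for solving the TAA problem but does not give the algorithm. As a rough illustration only, the sketch below applies textbook task-proposing deferred acceptance (the Gale-Shapley college-admissions form) to task placement. Everything specific in it is an assumption made for the example and is not taken from the paper: the `(pod, rack, host)` addressing, the hop counts in `hop_cost`, the `size * hops` traffic measure, and the function names. It should not be read as the Hit-scheduler itself.

```python
from collections import deque


def hop_cost(a, b):
    """Hypothetical hop count between two hosts addressed as (pod, rack, host)."""
    if a == b:
        return 0
    if a[:2] == b[:2]:
        return 2  # same rack: traffic turns around at the top-of-rack switch
    if a[0] == b[0]:
        return 4  # same pod: traffic goes up to an aggregation switch
    return 6      # different pods: traffic crosses a core switch


def stable_assign(task_data, machines, capacity):
    """Task-proposing deferred acceptance with machine capacities.

    task_data: task name -> (location of its input data, data size)
    machines:  list of machine locations
    capacity:  machine location -> number of task slots
    Returns a dict mapping each placed task to a machine.
    """
    # Assumed task preference: machines closer to the task's data come first.
    prefs = {
        t: deque(sorted(machines, key=lambda m: (hop_cost(loc, m), m)))
        for t, (loc, _) in task_data.items()
    }

    def traffic(t, m):
        # Assumed machine preference: lower (data size * hops) is better.
        loc, size = task_data[t]
        return size * hop_cost(loc, m)

    held = {m: [] for m in machines}  # tasks each machine tentatively accepts
    free = deque(sorted(task_data))   # tasks that still need to propose
    while free:
        t = free.popleft()
        if not prefs[t]:
            continue  # rejected everywhere; the task stays unassigned
        m = prefs[t].popleft()
        held[m].append(t)
        if len(held[m]) > capacity[m]:
            # Over capacity: reject the held task that costs the most traffic
            # (ties broken by task name so preferences stay strict).
            worst = max(held[m], key=lambda x: (traffic(x, m), x))
            held[m].remove(worst)
            free.append(worst)
    return {t: m for m, ts in held.items() for t in ts}


# Illustrative input: three single-slot machines, two tasks sharing one data host.
machines = [(0, 0, 0), (0, 0, 1), (1, 0, 0)]
capacity = {m: 1 for m in machines}
tasks = {
    "t1": ((0, 0, 0), 10),
    "t2": ((0, 0, 0), 5),
    "t3": ((1, 0, 0), 8),
}
assignment = stable_assign(tasks, machines, capacity)
print(assignment)
```

In this toy input, `t1` and `t2` both want the host that stores their data. One of them is bumped to the neighbouring host in the same rack rather than to another pod, which is the kind of topology-aware outcome the abstract describes.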


    Information

    Published In

    ICPP '18: Proceedings of the 47th International Conference on Parallel Processing
    August 2018
    945 pages
    ISBN: 9781450365109
    DOI: 10.1145/3225058
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    In-Cooperation

    • University of Oregon

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Hierarchical Clouds
    2. Joint Optimization
    3. MapReduce Scheduling
    4. Network Policy
    5. Topology Aware Assignment

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICPP 2018

    Acceptance Rates

    ICPP '18 Paper Acceptance Rate 91 of 313 submissions, 29%;
    Overall Acceptance Rate 91 of 313 submissions, 29%

    Cited By

    • (2023) Dynamic Resource Provisioning for Iterative Workloads on Apache Spark. IEEE Transactions on Cloud Computing 11:1 (639-652). DOI: 10.1109/TCC.2021.3108043. Online publication date: 1-Jan-2023
    • (2022) Joint Optimization of MapReduce Scheduling and Network Policy in Hierarchical Data Centers. IEEE Transactions on Cloud Computing 10:1 (461-473). DOI: 10.1109/TCC.2019.2961653. Online publication date: 1-Jan-2022
    • (2020) Data Life Aware Model Updating Strategy for Stream-based Online Deep Learning. 2020 IEEE International Conference on Cluster Computing (CLUSTER) (392-398). DOI: 10.1109/CLUSTER49012.2020.00049. Online publication date: Sep-2020
    • (2019) A Network-aware and Partition-based Resource Management Scheme for Data Stream Processing. Proceedings of the 48th International Conference on Parallel Processing (1-10). DOI: 10.1145/3337821.3337870. Online publication date: 5-Aug-2019
    • (2019) Delay-Optimal Traffic Engineering through Multi-agent Reinforcement Learning. IEEE INFOCOM 2019 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS) (435-442). DOI: 10.1109/INFCOMW.2019.8845154. Online publication date: Apr-2019
