DOI: 10.1145/3357223.3362710

Characterizing and Synthesizing Task Dependencies of Data-Parallel Jobs in Alibaba Cloud

Published: 20 November 2019

Abstract

Cluster schedulers routinely face data-parallel jobs with complex task dependencies expressed as directed acyclic graphs (DAGs). Understanding DAG structures and runtime characteristics in large production clusters therefore plays a key role in scheduler design, yet it remains an important missing piece in the literature. In this work, we present a comprehensive study of a recently released cluster trace from Alibaba. We examine the dependency structures of Alibaba jobs and find that their DAGs have sparsely connected vertices and can be approximately decomposed into multiple trees with bounded depth. We also characterize the runtime performance of DAGs and show that dependent tasks may vary significantly in resource usage and duration, even for recurring tasks. In both aspects, we compare the query jobs in the standard TPC benchmarks with the production DAGs and find that the former do not adequately represent the latter. To better benchmark DAG schedulers at scale, we develop a workload generator that faithfully synthesizes task dependencies based on the production Alibaba trace. Extensive evaluations show that the synthesized DAGs have statistical characteristics consistent with those of the production DAGs, and that the synthesized and real workloads yield similar scheduling results under various schedulers.
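
To make the structural terms in the abstract concrete, the following is a minimal Python sketch of how the vertex count, edge count, and depth of a single job DAG could be measured. It is not the authors' analysis code: the function name `dag_stats`, the dictionary-based input format, and the example job are all hypothetical and chosen only for illustration.

```python
from collections import deque


def dag_stats(parents):
    """Return task count, edge count, and depth of a dependency DAG.

    `parents` maps each task name to the list of tasks it depends on.
    This input format is an assumption made for this sketch.
    """
    # Collect every task, including ones that appear only as a dependency.
    tasks = set(parents)
    for deps in parents.values():
        tasks.update(deps)

    children = {t: [] for t in tasks}
    indegree = {t: 0 for t in tasks}
    for task, deps in parents.items():
        for dep in deps:
            children[dep].append(task)
            indegree[task] += 1

    # Kahn's topological traversal; depth[t] is the number of tasks on
    # the longest dependency chain that ends at t.
    depth = {t: 1 for t in tasks}
    queue = deque(t for t in tasks if indegree[t] == 0)
    visited = 0
    while queue:
        current = queue.popleft()
        visited += 1
        for child in children[current]:
            depth[child] = max(depth[child], depth[current] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    if visited != len(tasks):
        raise ValueError("dependency graph contains a cycle")

    return {
        "tasks": len(tasks),
        "edges": sum(len(deps) for deps in parents.values()),
        "depth": max(depth.values(), default=0),
    }


# Hypothetical four-task job: t3 waits for t1 and t2, and t4 waits for t3.
job = {"t1": [], "t2": [], "t3": ["t1", "t2"], "t4": ["t3"]}
stats = dag_stats(job)
print(stats)
```

A tree on n tasks has exactly n - 1 edges, so an edge count close to the task count is one rough signal of the sparse, tree-like structure the abstract describes.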

Cited By

  • (2024) Octopus: An End-to-end Multi-DAG Scheduling Method Based on Deep Reinforcement Learning. 2024 43rd Chinese Control Conference (CCC), 2588-2593. DOI: 10.23919/CCC63176.2024.10662729. Online publication date: 28-Jul-2024
  • (2024) Batch Jobs Load Balancing Scheduling in Cloud Computing Using Distributional Reinforcement Learning. IEEE Transactions on Parallel and Distributed Systems 35:1, 169-185. DOI: 10.1109/TPDS.2023.3334519. Online publication date: 1-Jan-2024
  • (2024) Delay Analysis of Multi-Priority Computing Tasks in Alibaba Cluster Traces. IEEE INFOCOM 2024 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), 1-6. DOI: 10.1109/INFOCOMWKSHPS61880.2024.10620866. Online publication date: 20-May-2024

    Published In

    SoCC '19: Proceedings of the ACM Symposium on Cloud Computing
    November 2019
    503 pages
    ISBN:9781450369732
    DOI:10.1145/3357223
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Cloud Resource Scheduling
    2. Workload Analysis

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    SoCC '19: ACM Symposium on Cloud Computing
    November 20 - 23, 2019
    Santa Cruz, CA, USA

    Acceptance Rates

    SoCC '19 Paper Acceptance Rate 39 of 157 submissions, 25%;
    Overall Acceptance Rate 169 of 722 submissions, 23%

    Article Metrics

    • Downloads (Last 12 months): 145
    • Downloads (Last 6 weeks): 16
    Reflects downloads up to 22 Nov 2024

    Cited By

    • (2024) Energy-Efficient Deployment of Stateful FaaS Vertical Applications on Edge Data Networks. 2024 33rd International Conference on Computer Communications and Networks (ICCCN), 1-9. DOI: 10.1109/ICCCN61486.2024.10637549. Online publication date: 29-Jul-2024
    • (2024) A Survey on Scheduling Techniques in Computing and Network Convergence. IEEE Communications Surveys & Tutorials 26:1, 160-195. DOI: 10.1109/COMST.2023.3329027. Online publication date: Sep-2025
    • (2024) Serverless application composition leveraging function fusion: Theory and algorithms. Future Generation Computer Systems 153, 403-418. DOI: 10.1016/j.future.2023.12.010. Online publication date: Apr-2024
    • (2024) Mitigating interference of microservices with a scoring mechanism in large-scale clusters. The Journal of Supercomputing 81:1. DOI: 10.1007/s11227-024-06534-7. Online publication date: 30-Oct-2024
    • (2024) Energy-efficient DAG scheduling with DVFS for cloud data centers. The Journal of Supercomputing. DOI: 10.1007/s11227-024-06035-7. Online publication date: 27-Mar-2024
    • (2024) GPU cluster dynamics: insights from Alibaba’s 2023 trace release. Computing 107:1. DOI: 10.1007/s00607-024-01369-9. Online publication date: 20-Nov-2024
    • (2024) PeersimGym: An Environment for Solving the Task Offloading Problem with Reinforcement Learning. Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track, 38-54. DOI: 10.1007/978-3-031-70378-2_3. Online publication date: 22-Aug-2024
