research-article

Multi-Attributes-Based Coflow Scheduling Without Prior Knowledge

Published: 01 August 2018

Abstract

In data centers, the coflow abstraction was proposed to better express the requirements and communication semantics of a group of parallel flows generated by the jobs of cluster computing frameworks. By exploiting coflow-level information, such as coflow size, previous coflow scheduling proposals improve performance over flow-level scheduling schemes. Recently, because some coflow information is difficult to obtain in cloud environments, designing coflow scheduling mechanisms that work with partial information, or without any, has attracted much attention. However, existing information-agnostic mechanisms are generally built on the least-attained-service heuristic, which schedules coflows only according to the bytes each coflow has sent, and they all ignore other useful coflow-level information such as width, length, and communication patterns. In this paper, we show that the coflow completion time can be further decreased by jointly leveraging multiple coflow-level attributes. Based on this observation, we present a Multiple-attributes-based Coflow Scheduling (MCS) mechanism to reduce the coflow completion time. In MCS, at the start of a coflow, a shortest-and-narrowest-coflow-first algorithm assigns the initial priority based on the coflow width. During the transmission of coflows, we propose a double-threshold scheme that, based on the bytes sent, adjusts the priorities of different classes of coflows according to different thresholds. The optimal thresholds are analyzed using the M/M/1 queuing model. Testbed evaluations and simulations with production workloads show that MCS outperforms the previous information-agnostic scheduler Aalo and reduces the completion time of small coflows.
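
The two-stage idea in the abstract (an initial priority from coflow width, then demotion driven by bytes sent with class-specific thresholds) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the width cutoff, the threshold values, the number of queues, and all names (`Coflow`, `coflow_class`, `priority`, `schedule_order`) are hypothetical choices made only for illustration, and the paper's M/M/1-derived thresholds are not reproduced here.

```python
from bisect import bisect_right
from dataclasses import dataclass

# All constants below are illustrative assumptions, not values from the paper.
WIDTH_CUTOFF = 50          # coflows with at most this many flows count as "narrow"
NUM_QUEUES = 4             # queue 0 has the highest priority
INITIAL_QUEUE = {"narrow": 0, "wide": 1}
THRESHOLDS = {             # bytes sent at which a coflow is demoted one queue
    "narrow": [10e6, 1e9],
    "wide": [1e6, 100e6],
}


@dataclass
class Coflow:
    name: str
    width: int              # number of parallel flows in the coflow
    sent_bytes: float = 0.0


def coflow_class(c: Coflow) -> str:
    """Classify a coflow by width only, since its size is unknown in advance."""
    return "narrow" if c.width <= WIDTH_CUTOFF else "wide"


def priority(c: Coflow) -> int:
    """Queue index: initial queue from width plus one demotion per threshold reached."""
    cls = coflow_class(c)
    demotions = bisect_right(THRESHOLDS[cls], c.sent_bytes)
    return min(INITIAL_QUEUE[cls] + demotions, NUM_QUEUES - 1)


def schedule_order(coflows):
    """Serve lower queue indices first; break ties by fewer bytes sent."""
    return sorted(coflows, key=lambda c: (priority(c), c.sent_bytes))


a = Coflow("a", width=5)
b = Coflow("b", width=200)
start = (priority(a), priority(b))   # priorities before any bytes are sent
a.sent_bytes = 20e6                  # crosses the first "narrow" threshold
b.sent_bytes = 200e6                 # crosses both "wide" thresholds
```

Using a separate threshold list per class is one simple way to express "different thresholds for different classes"; a real scheduler would also need to track per-flow state and enforce the resulting priorities in the network.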

Cited By

  • (2023) Toward Network-Aware Query Execution Systems in Large Datacenters. IEEE Transactions on Network and Service Management, 20:4 (4494-4504). DOI: 10.1109/TNSM.2023.3273166. Online publication date: 1-Dec-2023
  • (2021) Fair and near-optimal coflow scheduling without prior knowledge of coflow size. The Journal of Supercomputing, 77:7 (7690-7717). DOI: 10.1007/s11227-020-03614-2. Online publication date: 1-Jul-2021
  • (2019) Efficient Scheduling of Weighted Coflows in Data Centers. IEEE Transactions on Parallel and Distributed Systems, 30:9 (2003-2017). DOI: 10.1109/TPDS.2019.2905560. Online publication date: 6-Aug-2019
        Published In

        IEEE/ACM Transactions on Networking, Volume 26, Issue 4
        August 2018
        471 pages

        Publisher

        IEEE Press

        Publication History

        Published: 01 August 2018
        Published in TON Volume 26, Issue 4
