Nothing Special   »   [go: up one dir, main page]

skip to main content
article

MapTask scheduling in mapreduce with data locality: throughput and heavy-traffic optimality

Published: 01 February 2016 Publication History

Abstract

MapReduce/Hadoop framework has been widely used to process large-scale datasets on computing clusters. Scheduling map tasks with data locality consideration is crucial to the performance of MapReduce. Many works have been devoted to increasing data locality for better efficiency. However, to the best of our knowledge, fundamental limits of MapReduce computing clusters with data locality, including the capacity region and theoretical bounds on the delay performance, have not been well studied. In this paper, we address these problems from a stochastic network perspective. Our focus is to strike the right balance between data locality and load balancing to simultaneously maximize throughput and minimize delay. We present a new queueing architecture and propose a map task scheduling algorithm constituted by the Join the Shortest Queue policy together with the MaxWeight policy. We identify an outer bound on the capacity region, and then prove that the proposed algorithm can stabilize any arrival rate vector strictly within this outer bound. It shows that the outer bound coincides with the actual capacity region, and the proposed algorithm is throughput-optimal. Furthermore, we study the number of backlogged tasks under the proposed algorithm, which is directly related to the delay performance based on Little's law. We prove that the proposed algorithm is heavy-traffic optimal, i.e., it asymptotically minimizes the number of back-logged tasks as the arrival rate vector approaches the boundary of the capacity region. Therefore, the proposed algorithm is also delay-optimal in the heavy-traffic regime. The proofs in this paper deal with random processing times with heterogeneous parameters and nonpreemptive task execution, which differentiate our work from many existing works on MaxWeight-type algorithms, so the proof techniques themselves for the stability analysis and the heavy-traffic analysis are also novel contributions.

References

[1]
J. Dean and S. Ghemawat, "Mapreduce: simplified data processing on large clusters," ACM Commun., vol. 51, no. 1, pp. 107-113, Jan. 2008.
[2]
"Hadoop," 2014 [Online]. Available: http://hadoop.apache.org
[3]
G. Ananthanarayanan et al., "Scarlett: Coping with skewed content popularity in mapreduce clusters," in Proc. EuroSys, Salzburg, Austria, 2011, pp. 287-300.
[4]
S. Ghemawat, H. Gobioff, and S.-T. Leung, "The Google file system," in Proc. ACM SOSP, Bolton Landing, NY, USA, 2003, pp. 29-43.
[5]
K. Shvachko, H. Kuang, S. Radia, and R. Chansler, "The Hadoop distributed file system," in Proc. IEEE MSST, Incline Village, NV, USA, May 2010, pp. 1-10.
[6]
T. White, Hadoop: The Definitive Guide. Sunnyvale, CA, USA: Yahoo Press, 2010.
[7]
M. Zaharia et al., "Delay scheduling: A simple technique for achieving locality and fairness in cluster scheduling," in Proc. EuroSys, Paris, France, 2010, pp. 265-278.
[8]
J. Polo et al., "Resource-aware adaptive scheduling for mapreduce clusters," in Proc. ACM/IFIP/USENIX Int. Conf. Middleware, Lisbon, Portugal, 2011, pp. 187-207.
[9]
J. Jin, J. Luo, A. Song, F. Dong, and R. Xiong, "Bar: An efficient data locality driven task scheduling algorithm for cloud computing," in Proc. IEEE/ACM CCGRID, Newport Beach, CA, USA, 2011, pp. 295-304.
[10]
Q. Xie and Y. Lu, "Degree-guided map-reduce task assignment with data locality constraint," in Proc. IEEE ISIT, Cambridge, MA, USA, 2012, pp. 985-989.
[11]
M. Isard et al., "Quincy: Fair scheduling for distributed computing clusters," in Proc. ACM SOSP, Big Sky, MT, USA, 2009, pp. 261-276.
[12]
S. Kavulya, J. Tan, R. Gandhi, and P. Narasimhan, "An analysis of traces from a production MapReduce cluster," in Proc. IEEE/ACM CCGRID, Melbourne, Australia, 2010, pp. 94-103.
[13]
Y. Chen, S. Alspaugh, D. Borthakur, and R. Katz, "Energy efficiency for large-scale mapreduce workloads with significant interactive analysis," in Proc. EuroSys, Bern, Switzerland, 2012, pp. 43-56.
[14]
M. Zaharia, A. Konwinski, A. D. Joseph, R. Katz, and I. Stoica, "Improving mapreduce performance in heterogeneous environments," in Proc. USENIX OSDI, San Diego, CA, USA, 2008, pp. 29-42.
[15]
F. Chen, M. Kodialam, and T. V. Lakshman, "Joint scheduling of processing and shuffle phases in MapReduce systems," in Proc. IEEE INFOCOM, Orlando, FL, USA, Mar. 2012, pp. 1143-1151.
[16]
J. Tan, X. Meng, and L. Zhang, "Coupling task progress for MapReduce resource-aware scheduling," in Proc. IEEE INFOCOM, Turin, Italy, Apr. 2013, pp. 1618-1626.
[17]
M. Lin, L. Zhang, A. Wierman, and J. Tan, "Joint optimization of overlapping phases in MapReduce," in Proc. IFIP Performance, Vienna, Austria, 2013, vol. 70, no. 10, pp. 720-735.
[18]
L. Tassiulas and A. Ephremides, "Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks," IEEE Trans. Autom. Control, vol. 37, no. 12, pp. 1936-1948, Dec. 1992.
[19]
W. Wang, K. Zhu, L. Ying, J. Tan, and L. Zhang, "Map task scheduling in MapReduce with data locality: Throughput and heavy-traffic optimality," in Proc. IEEE INFOCOM, Turin, Italy, 2013, pp. 1609-1617.
[20]
W. Wang, K. Zhu, L. Ying, J. Tan, and L. Zhang, "A throughput optimal algorithm for map task scheduling in mapreduce with data locality," Perform. Eval. Rev., vol. 40, no. 4, pp. 33-42, Mar. 2013.
[21]
S. T. Maguluri and R. Srikant, "Scheduling jobs with unknown duration in clouds," in Proc. IEEE INFOCOM, Turin, Italy, 2013, pp. 1887-1895.
[22]
A. Eryilmaz and R. Srikant, "Asymptotically tight steady-state queue length bounds implied by drift conditions," Queueing Syst., vol. 72, no. 3-4, pp. 311-359, Dec. 2012.
[23]
S. T. Maguluri, R. Srikant, and L. Ying, "Heavy traffic optimal resource allocation algorithms for cloud computing clusters," in Proc. ITC, Krakow, Poland, 2012, pp. 1-8.
[24]
R. Srikant and L. Ying, Communication Networks: An Optimization, Control and Stochastic Networks Perspective. New York, NY, USA: Cambridge Univ. Press, 2014.
[25]
J. F. C. Kingman, "Some inequalities for the queue GI/G/1," Biometrika, vol. 49, no. 3-4, pp. 315-324, Dec. 1962.
[26]
B. Hajek, "Hitting-time and occupation-time bounds implied by drift analysis with applications," Ann. Appl. Prob., vol. 14, no. 3, pp. 502-525, 1982.
[27]
G. Ananthanarayanan et al., "Pacman: Coordinated memory caching for parallel jobs," in Proc. USENIX NSDI, 2012, pp. 20-20.
[28]
R. G. Gallager, Discrete Stochastic Processes, ser. The Springer International Series in Engineering and Computer Science. New York, NY, USA: Springer, 1996.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image IEEE/ACM Transactions on Networking
IEEE/ACM Transactions on Networking  Volume 24, Issue 1
February 2016
635 pages
ISSN:1063-6692
  • Editor:
  • R. Srikant
Issue’s Table of Contents

Publisher

IEEE Press

Publication History

Published: 01 February 2016
Accepted: 09 September 2014
Revised: 22 April 2014
Received: 07 August 2013
Published in TON Volume 24, Issue 1

Author Tags

  1. MapReduce
  2. heavy-traffic analysis
  3. queueing systems
  4. scheduling
  5. throughput optimality

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1
  • Downloads (Last 6 weeks)0
Reflects downloads up to 16 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Server Saturation in Skewed NetworksProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/36560188:2(1-37)Online publication date: 29-May-2024
  • (2024)Fully-Dynamic Load BalancingInteger Programming and Combinatorial Optimization10.1007/978-3-031-59835-7_14(182-195)Online publication date: 3-Jul-2024
  • (2023)Load Balancing Under Strict Compatibility ConstraintsMathematics of Operations Research10.1287/moor.2022.125848:1(227-256)Online publication date: 1-Feb-2023
  • (2023)Cost-based Data Prefetching and Scheduling in Big Data Platforms over Tiered Storage SystemsACM Transactions on Database Systems10.1145/362538948:4(1-40)Online publication date: 13-Nov-2023
  • (2023)DSORL: Data Source Optimization With Reinforcement Learning Scheme for Vehicular Named Data NetworksIEEE Transactions on Intelligent Transportation Systems10.1109/TITS.2023.329203324:10(11225-11237)Online publication date: 1-Oct-2023
  • (2022)Scheduling Parallel-Task Jobs Subject to Packing and Placement ConstraintsOperations Research10.1287/opre.2021.219870:6(3403-3419)Online publication date: 1-Nov-2022
  • (2022)Heavy-Traffic Insensitive Bounds for Weighted Proportionally Fair Bandwidth Sharing PoliciesMathematics of Operations Research10.1287/moor.2021.122547:4(2691-2720)Online publication date: 1-Nov-2022
  • (2022)Load Balancing Under Strict Compatibility ConstraintsACM SIGMETRICS Performance Evaluation Review10.1145/3543516.345627549:1(51-52)Online publication date: 7-Jun-2022
  • (2021)TridentProceedings of the VLDB Endowment10.14778/3461535.346154514:9(1570-1582)Online publication date: 22-Oct-2021
  • (2021)Optimal Load Balancing with Locality ConstraintsProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/34283304:3(1-37)Online publication date: 15-Jun-2021
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media