SAMES: deadline-constraint scheduling in MapReduce

Published: 01 February 2015

Abstract

MapReduce is a popular parallel data-processing system, and task scheduling is one of its core techniques. In many applications, users require that their MapReduce jobs complete before specific deadlines. Hence, in this paper, a novel scheduling algorithm based on the most effective sequence (SAMES) is proposed for deadline-constraint jobs in MapReduce. First, based on the characteristics of MapReduce, we propose a novel sequence-based execution strategy for MapReduce jobs and a new concept, the effective sequence (ES). Then, we design efficient approaches for finding ESes and choose the most effective sequence (MES) for job execution. We also propose methods for MES updates and exception handling. Finally, we verify the effectiveness of SAMES through experiments. The experimental results show that SAMES is an efficient scheduling algorithm for deadline-constraint jobs in MapReduce.
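
The abstract does not define the effective sequence or the SAMES algorithm itself, so the following is only a rough illustration of the general idea of checking a job sequence against deadlines. It is not the paper's method. It assumes that jobs run one after another on the whole cluster, that each job's runtime estimate is known in advance, and it uses the classic earliest-deadline-first (EDF) ordering as a stand-in. The names `Job`, `meets_deadlines`, and `edf_sequence` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Job:
    """Hypothetical job record; not taken from the paper."""
    name: str
    runtime: float   # assumed: estimated time to run alone on the cluster
    deadline: float  # absolute deadline, measured from time 0


def meets_deadlines(sequence: List[Job], start: float = 0.0) -> bool:
    """Return True if every job finishes by its deadline when the jobs
    run back to back in the given order, starting at `start`."""
    clock = start
    for job in sequence:
        clock += job.runtime          # job finishes at this time
        if clock > job.deadline:
            return False
    return True


def edf_sequence(jobs: List[Job]) -> Optional[List[Job]]:
    """Order jobs by earliest deadline first (EDF), a stand-in technique.

    For back-to-back execution on a single resource, if any ordering meets
    all deadlines then the EDF ordering does too, so None means no
    deadline-respecting sequence exists under these assumptions."""
    ordered = sorted(jobs, key=lambda j: j.deadline)
    return ordered if meets_deadlines(ordered) else None


if __name__ == "__main__":
    jobs = [Job("a", 3, 10), Job("b", 2, 4), Job("c", 4, 9)]
    plan = edf_sequence(jobs)
    print([j.name for j in plan] if plan else "no feasible sequence")
```

A real MapReduce scheduler would also have to account for map and reduce slots, task-level parallelism, and runtime estimation errors, none of which this sketch models.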

Published In

Frontiers of Computer Science: Selected Publications from Chinese Universities, Volume 9, Issue 1
February 2015
169 pages
ISSN: 2095-2228
EISSN: 2095-2236

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. MapReduce
  2. deadline
  3. scheduling

Qualifiers

  • Article

Cited By

  • (2023) MapReduce scheduling algorithms in Hadoop: a systematic study. Journal of Cloud Computing: Advances, Systems and Applications, 12:1. DOI: 10.1186/s13677-023-00520-9. Online publication date: 10-Oct-2023
  • (2020) MQWAGS. Proceedings of the 2020 4th High Performance Computing and Cluster Technologies Conference & 2020 3rd International Conference on Big Data and Artificial Intelligence, pp. 42-48. DOI: 10.1145/3409501.3409515. Online publication date: 3-Jul-2020
  • (2020) Research on Job Scheduling Algorithms Based on Cloud Computing. Green, Pervasive, and Cloud Computing, pp. 481-495. DOI: 10.1007/978-3-030-64243-3_36. Online publication date: 13-Nov-2020
  • (2017) CRED: Cloud Right-Sizing with Execution Deadlines and Data Locality. IEEE Transactions on Parallel and Distributed Systems, 28:12, pp. 3389-3400. DOI: 10.1109/TPDS.2017.2726071. Online publication date: 9-Nov-2017
  • (2017) Two-Stage Job Scheduling Model Based on Revenues and Resources. Network and Parallel Computing, pp. 37-48. DOI: 10.1007/978-3-319-68210-5_4. Online publication date: 20-Oct-2017
  • (2016) Distributed error estimation of functional dependency. Information Sciences: an International Journal, 345:C, pp. 156-176. DOI: 10.1016/j.ins.2016.01.051. Online publication date: 1-Jun-2016
