Replica-aware task scheduling and load balanced cache placement for delay reduction in multi-cloud environment

Chunlin Li^1,2, Jing Zhang^2,3 & Hengliang Tang^4

Abstract

The growth of content-sharing and collaborative computing services, such as online social networks and scientific workflows, generates huge amounts of data. To process this data, multi-cloud systems, which integrate multiple clouds to provide a unified service in a collaborative manner, have been introduced. However, task scheduling in such a heterogeneous multi-cloud environment is very challenging. To reduce the response delay caused by cross-data-center file access, we propose a replica-aware task scheduling algorithm based on data replication. To speed up data access in multi-cloud cooperative caches, we present a load balanced cache placement algorithm based on Bayesian networks. Our scheduling algorithm combines transferring computation with transferring data, and matches resources according to node locality. Only the input data of non-local unassigned and failed map tasks is replicated and transferred in advance to target nodes to expedite task execution. In our cache placement method, the next task to execute is predicted using Bayesian networks. Files to prefetch into the cache are selected according to caching profit and recycling cost, and the target placement node for each prefetched file is determined according to load balancing. Extensive experimental results show that the proposed replica-aware task scheduling algorithm outperforms benchmark scheduling algorithms in terms of node locality ratio and job response time, and that the load balanced cache placement algorithm outperforms the baseline caching algorithms in terms of prefetching hit ratio and execution time saving ratio.
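
The three cache placement steps named in the abstract (predict the next task, select prefetch files by caching profit and recycling cost, place each file by load balancing) can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no formulas, so the Bayesian network is replaced here by a plain table of conditional probabilities, a file is assumed worth prefetching when its profit exceeds its recycling cost, and "load" is assumed to be the fraction of cache capacity in use. All names (`Node`, `predict_next_task`, `select_prefetch_files`, `place_file`) and all numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capacity: int  # cache capacity in arbitrary size units (assumed)
    used: int = 0
    files: set = field(default_factory=set)

    @property
    def load(self) -> float:
        # Assumed load metric: fraction of cache capacity already in use.
        return self.used / self.capacity


def predict_next_task(current_task, transitions):
    """Return the most probable successor of current_task, or None.

    `transitions` is a stand-in for the Bayesian network: a table of
    conditional probabilities P(next | current). This is a simplification.
    """
    successors = transitions.get(current_task, {})
    if not successors:
        return None
    return max(successors, key=successors.get)


def select_prefetch_files(candidates):
    """Keep files whose caching profit exceeds their recycling cost,
    ordered by net gain, largest first (assumed selection rule)."""
    worthwhile = [c for c in candidates if c["profit"] > c["recycle_cost"]]
    return sorted(
        worthwhile,
        key=lambda c: c["profit"] - c["recycle_cost"],
        reverse=True,
    )


def place_file(file, nodes):
    """Place the file on the least-loaded node that has room for it."""
    fitting = [n for n in nodes if n.capacity - n.used >= file["size"]]
    if not fitting:
        return None
    target = min(fitting, key=lambda n: n.load)
    target.used += file["size"]
    target.files.add(file["name"])
    return target


# Hypothetical example data.
transitions = {"map_a": {"reduce_a": 0.7, "map_b": 0.3}}
task_inputs = {
    "reduce_a": [
        {"name": "f1", "size": 4, "profit": 9.0, "recycle_cost": 2.0},
        {"name": "f2", "size": 3, "profit": 1.0, "recycle_cost": 5.0},
        {"name": "f3", "size": 2, "profit": 6.0, "recycle_cost": 1.0},
    ]
}
nodes = [Node("n1", 10, used=4), Node("n2", 10, used=1)]

next_task = predict_next_task("map_a", transitions)
placements = {}
for f in select_prefetch_files(task_inputs[next_task]):
    node = place_file(f, nodes)
    placements[f["name"]] = node.name if node else None

print(next_task, placements)
```

In this toy run, `f2` is skipped because its recycling cost exceeds its profit, and the two remaining files land on different nodes because the first placement raises the load of the node that received it.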

Acknowledgements

The work was supported by the National Natural Science Foundation (NSF) under Grants (Nos. 61672397, 61873341, 61472294, 61771354), Application Foundation Frontier Project of Wuhan (No. 2018010401011290), the Young Teachers’ Scientific Research Ability Promotion Project of Huanghuai University (No. 2017LX09), Beijing Intelligent Logistics System Collaborative Innovation Center Open Project (No. BILSCIC-2018KF-02), Key Laboratory of Agricultural Remote Sensing [2017002], Beijing Youth Top-notch Talent Plan of High-Creation Plan (No. 2017000026833ZK25), Canal Plan-Leading Talent Project of Beijing Tongzhou District (No. YHLB2017038), and Beijing Key Laboratory of Intelligent Logistics System (No. BZ0211). Any opinions, findings, and conclusions are those of the authors and do not necessarily reflect the views of the above agencies.

Author information

Authors and Affiliations

Key Laboratory of Agricultural Remote Sensing, Ministry of Agriculture, Beijing, 100081, People’s Republic of China
Chunlin Li
Department of Computer Science, Wuhan University of Technology, Wuhan, 430063, People’s Republic of China
Chunlin Li & Jing Zhang
International College, Huanghuai University, Zhumadian, 463000, China
Jing Zhang
School of Information, Beijing Wuzi University, Beijing, 101149, China
Hengliang Tang

Corresponding author

Correspondence to Chunlin Li.

About this article

Cite this article

Li, C., Zhang, J. & Tang, H. Replica-aware task scheduling and load balanced cache placement for delay reduction in multi-cloud environment. J Supercomput 75, 2805–2836 (2019). https://doi.org/10.1007/s11227-018-2695-9

Published: 17 November 2018
Issue Date: 01 May 2019
DOI: https://doi.org/10.1007/s11227-018-2695-9
