Enabling collaborative MapReduce on the Cloud with a single-sign-on mechanism

Computing

Abstract

Cloud computing introduces a computing paradigm that allows users to run their applications in a customized environment using on-demand resources. This paradigm is enabled by several technologies, including the Web, virtualization, distributed file systems, and parallel programming models. For parallel computing on the Cloud, MapReduce is currently the first choice of Cloud providers for delivering data analysis services, because the model is designed specifically for data-intensive applications and a Cloud centre is also a data centre, hosting huge amounts of data, usually at petascale. The current deployment of MapReduce on the Cloud, however, follows the traditional MapReduce execution model, which requires the support of a cluster manager. The individual virtual machines created on the Cloud therefore have to be organized into a cluster before they can run a MapReduce application. This is not only a burden for system management but also prevents inter-Cloud computing, in which the resources of different Clouds are combined to solve large problems with big or distributed data. We developed a software framework that lets individual virtual machines execute a MapReduce application in a parallel, collaborative way without installing middleware or a specific software package for system management. A focus of this work is a Single-Sign-On (SSON) mechanism that enables remote access to the individual machines. We validated the SSON mechanism together with the entire MapReduce framework on a private Cloud. Experimental results demonstrate both the functionality and the feasibility of our approach.
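To make the MapReduce execution model mentioned in the abstract concrete, the following is a minimal word-count sketch in Python. It is not the authors' framework: the worker names are hypothetical, local threads stand in for independent virtual machines, and the remote access that the SSON mechanism would provide is omitted entirely. It only illustrates the split, map, shuffle, and reduce steps that a set of uncoordinated workers would have to perform.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(mapped_outputs):
    """Shuffle step: group intermediate values by key across all workers."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one word."""
    return key, sum(values)


def run_job(lines, workers):
    """Run a word count over `lines`, one input split per worker.

    `workers` is a list of hypothetical machine names. Each name is only
    used to decide how many splits to create; a thread stands in for the
    remote virtual machine (an assumption made for illustration).
    """
    if not workers:
        raise ValueError("at least one worker is required")
    # Round-robin split of the input, one split per worker.
    splits = [lines[i::len(workers)] for i in range(len(workers))]
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        mapped = list(pool.map(map_phase, splits))
    groups = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    counts = run_job(["a b a", "b c"], ["vm-1", "vm-2"])
    print(counts)
```

In a real deployment each split would be processed on a separate machine and the intermediate pairs would travel over the network, which is where authenticated remote access becomes necessary.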

Author information

Corresponding author

Correspondence to Jie Tao.

About this article

Cite this article

Zhao, J., Tao, J. & Streit, A. Enabling collaborative MapReduce on the Cloud with a single-sign-on mechanism. Computing 98, 55–72 (2016). https://doi.org/10.1007/s00607-014-0390-0

  • DOI: https://doi.org/10.1007/s00607-014-0390-0
