Security-aware intermediate data placement strategy in scientific cloud workflows

Published: 01 November 2014

Abstract

The massive computation power and storage capacity of cloud computing systems allow scientists to deploy data-intensive applications and store large application datasets in the cloud without investing in their own infrastructure. Building on the pay-as-you-go model, data placement strategies have been developed to cost-effectively store the large volumes of datasets generated in scientific cloud workflows. Promising as this paradigm is, it also introduces new data security challenges when users outsource sensitive data for sharing on cloud servers, which are not within the same trusted domain as the data owners. The challenge is further complicated by the security constraints on potentially sensitive data in scientific cloud workflows. To address this problem, we propose a security-aware intermediate data placement strategy. First, we build a security overhead model to measure the security overheads incurred by sensitive data. Second, we develop a data placement strategy that dynamically places the intermediate data of scientific workflows. Finally, our experimental results show that the strategy effectively improves intermediate data security while still guaranteeing data transfer time during the execution of scientific workflows.
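
The abstract does not give the security overhead model or the placement algorithm, so the following is only a toy sketch of the general trade-off it describes: choosing where to put an intermediate dataset so that security is favoured while a transfer-time bound is still met. Every name and formula here (`Dataset`, `DataCenter`, `security_shortfall`, `transfer_time_s`, `place`, the `[0, 1]` security levels, the greedy selection rule) is an assumption made for illustration and is not the paper's method.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Dataset:
    name: str
    size_gb: float
    sensitivity: float  # hypothetical required security level in [0, 1]


@dataclass(frozen=True)
class DataCenter:
    name: str
    security_level: float  # hypothetical offered security level in [0, 1]
    bandwidth_gb_per_s: float  # hypothetical link speed to the consuming task


def security_shortfall(ds: Dataset, dc: DataCenter) -> float:
    """Toy stand-in for a security measure: how far the data center
    falls short of the level the dataset asks for (0.0 means none)."""
    return max(0.0, ds.sensitivity - dc.security_level)


def transfer_time_s(ds: Dataset, dc: DataCenter) -> float:
    """Time in seconds to move the dataset over the data center's link."""
    return ds.size_gb / dc.bandwidth_gb_per_s


def place(ds: Dataset, centers: Sequence[DataCenter], max_transfer_s: float) -> DataCenter:
    """Greedy rule (an assumption): among data centers that meet the
    transfer-time bound, pick the smallest security shortfall and break
    ties by the shorter transfer time."""
    feasible = [dc for dc in centers if transfer_time_s(ds, dc) <= max_transfer_s]
    if not feasible:
        raise ValueError(f"no data center meets the {max_transfer_s}s bound for {ds.name}")
    return min(feasible, key=lambda dc: (security_shortfall(ds, dc), transfer_time_s(ds, dc)))


if __name__ == "__main__":
    centers = [
        DataCenter("secure-slow", security_level=0.9, bandwidth_gb_per_s=1.0),
        DataCenter("open-fast", security_level=0.5, bandwidth_gb_per_s=10.0),
    ]
    d = Dataset("intermediate-1", size_gb=20.0, sensitivity=0.8)
    print(place(d, centers, max_transfer_s=30.0).name)
```

With a loose bound the sketch prefers the more secure data center; tightening the bound forces it onto the faster, less secure one, which is the kind of tension between security and transfer time that the abstract refers to.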

    Published In

    Knowledge and Information Systems, Volume 41, Issue 2
    November 2014
    305 pages

    Publisher

    Springer-Verlag

    Berlin, Heidelberg

    Author Tags

    1. Cloud computing
    2. Data placement
    3. Data security
    4. Scientific workflow

    Qualifiers

    • Article

    Cited By

    • (2024) Security challenges for workflow allocation model in cloud computing environment: a comprehensive survey, framework, taxonomy, open issues, and future directions. The Journal of Supercomputing 80:8 (11491-11555). DOI: 10.1007/s11227-023-05873-1. Online publication date: 1-May-2024
    • (2023) Security and privacy concerns in cloud-based scientific and business workflows. Future Generation Computer Systems 148:C (184-200). DOI: 10.1016/j.future.2023.05.015. Online publication date: 1-Nov-2023
    • (2023) A review of task scheduling in cloud computing based on nature-inspired optimization algorithm. Cluster Computing 26:5 (3037-3067). DOI: 10.1007/s10586-023-04090-y. Online publication date: 29-Jun-2023
    • (2019) Security and Cost-Aware Computation Offloading via Deep Reinforcement Learning in Mobile Edge Computing. Wireless Communications & Mobile Computing 2019. DOI: 10.1155/2019/3816237. Online publication date: 23-Dec-2019
    • (2019) Security modeling and efficient computation offloading for service workflow in mobile edge computing. Future Generation Computer Systems 97:C (755-774). DOI: 10.1016/j.future.2019.03.011. Online publication date: 1-Aug-2019
    • (2019) On cloud security requirements, threats, vulnerabilities and countermeasures. Computer Science Review 33:C (1-48). DOI: 10.1016/j.cosrev.2019.05.002. Online publication date: 1-Aug-2019
    • (2017) Scheduling for Workflows with Security-Sensitive Intermediate Data by Selective Tasks Duplication in Clouds. IEEE Transactions on Parallel and Distributed Systems 28:9 (2674-2688). DOI: 10.1109/TPDS.2017.2678507. Online publication date: 7-Aug-2017
    • (2017) A review of task scheduling based on meta-heuristics approach in cloud computing. Knowledge and Information Systems 52:1 (1-51). DOI: 10.1007/s10115-017-1044-2. Online publication date: 1-Jul-2017
    • (2017) Cloud resource allocation schemes. Knowledge and Information Systems 50:2 (347-381). DOI: 10.1007/s10115-016-0951-y. Online publication date: 1-Feb-2017
    • (2016) Security-aware workflow scheduling with selective task duplication in clouds. Proceedings of the 24th High Performance Computing Symposium (1-8). DOI: 10.22360/SpringSim.2016.HPC.048. Online publication date: 3-Apr-2016
