Privacy Preservation over Big Data in Cloud Systems

Xuyun Zhang, Chang Liu, Surya Nepal, Chi Yang & Jinjun Chen

Abstract

Cloud computing and Big Data, two disruptive trends at present, exert significant influence on the current IT industry and research communities. Cloud computing provides massive computation power and storage capacity, which enable users to deploy applications without infrastructure investment.

Author information

Authors and Affiliations

Faculty of Engineering and IT, University of Technology Sydney, Sydney, Australia
Xuyun Zhang, Chang Liu, Chi Yang & Jinjun Chen
ICT Centre, CSIRO, Sydney, Australia
Surya Nepal

Corresponding author

Correspondence to Xuyun Zhang.

Editor information

Editors and Affiliations

CSIRO ICT Centre, Marsfield, New South Wales, Australia
Surya Nepal
Telstra Corporation Limited, Melbourne, Victoria, Australia
Mukaddim Pathan

About this chapter

Cite this chapter

Zhang, X., Liu, C., Nepal, S., Yang, C., Chen, J. (2014). Privacy Preservation over Big Data in Cloud Systems. In: Nepal, S., Pathan, M. (eds) Security, Privacy and Trust in Cloud Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38586-5_8

DOI: https://doi.org/10.1007/978-3-642-38586-5_8
Published: 04 September 2013
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38585-8
Online ISBN: 978-3-642-38586-5
eBook Packages: Engineering, Engineering (R0)
