Privacy Preservation over Big Data in Cloud Systems

Chapter in Security, Privacy and Trust in Cloud Systems

Abstract

Cloud computing and Big Data, two currently disruptive trends, have a significant influence on the IT industry and research communities. Cloud computing provides massive computation power and storage capacity, which enable users to deploy applications without infrastructure investment.

Author information

Corresponding author

Correspondence to Xuyun Zhang.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Zhang, X., Liu, C., Nepal, S., Yang, C., Chen, J. (2014). Privacy Preservation over Big Data in Cloud Systems. In: Nepal, S., Pathan, M. (eds) Security, Privacy and Trust in Cloud Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38586-5_8

  • DOI: https://doi.org/10.1007/978-3-642-38586-5_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38585-8

  • Online ISBN: 978-3-642-38586-5

  • eBook Packages: Engineering, Engineering (R0)
