Nothing Special   »   [go: up one dir, main page]

Skip to main content

K-Means Clustering Using Homomorphic Encryption and an Updatable Distance Matrix: Secure Third Party Data Clustering with Limited Data Owner Interaction

  • Conference paper
  • First Online:
Big Data Analytics and Knowledge Discovery (DaWaK 2017)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 10440))

Included in the following conference series:

Abstract

Third party data analysis raises data privacy preservation concerns, therefore raising questions as to whether such outsourcing is viable. Cryptography allows a level of data confidentiality. Although some cryptography algorithms, such as Homomorphic Encryption (HE), allow a limited amount of data manipulation, the disadvantage is that encryption precludes any form of sophisticated analysis. For this to be achieved the encrypted data needs to coupled with additional information to facilitate third party analysis. This paper proposes a mechanism for secure k-means clustering that uses HE and the concept of an Updatable Distance Matrix (UDM). The mechanism is fully described and analysed. The reported evaluation shows that the proposed mechanism produces identical clustering results as when “standard” k-means is applied, but in a secure manner. The proposed mechanism thus allows the application of clustering algorithms to encrypted data while preserving both correctness and data privacy.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    A “trapdoor” in this context is a function that can be simply computed on one direction but is difficult to compute in the reverse direction without additional knowledge; the concept is widely used in cryptography.

References

  1. Agrawal, R., Srikant, R.: Privacy-preserving data mining. SIGMOD Rec. 29(2), 439–450 (2000)

    Article  Google Scholar 

  2. Berinato, S.: There’s no such thing as anonymous data. Harvard Business Review, February 2015

    Google Scholar 

  3. Chen, T., Chen, J., Zhou, B.: A system for parallel data mining service on cloud. In: 2012 Second International Conference on Cloud and Green Computing, pp. 329–330, November 2012

    Google Scholar 

  4. Chhinkaniwala, H., Garg, S.: Privacy preserving data mining techniques: Challenges and issues. In: Proceedings of International Conference on Computer Science and Information Technology, CSlT, pp. 609, July 2011

    Google Scholar 

  5. Erkin, Z., Veugen, T., Toft, T., Lagendijk, R.L.: Privacy-preserving user clustering in a social network. In: 2009 First IEEE International Workshop on Information Forensics and Security (WIFS), pp. 96–100. IEEE, December 2009

    Google Scholar 

  6. Goldreich, O.: Foundations of Cryptography: Volume 2, Basic Applications. Cambridge University Press, New York (2009)

    MATH  Google Scholar 

  7. Jha, S., Kruger, L., McDaniel, P.: Privacy preserving clustering. In: Vimercati, S.C., Syverson, P., Gollmann, D. (eds.) ESORICS 2005. LNCS, vol. 3679, pp. 397–417. Springer, Heidelberg (2005). doi:10.1007/11555827_23

    Chapter  Google Scholar 

  8. Lichman, M.: UCI machine learning repository (2013)

    Google Scholar 

  9. Lindell, Y., Pinkas, B.: Privacy preserving data mining. J. Cryptol. 15(3), 177–206 (2002)

    Article  MathSciNet  MATH  Google Scholar 

  10. Liu, D.: Homomorphic encryption for database querying, December 2013

    Google Scholar 

  11. Liu, D., Bertino, E., Yi, X.: Privacy of outsourced k-means clustering. In: Proceedings of the 9th ACM Symposium on Information, Computer and Communications Security, pp. 123–134. ACM, June 2014

    Google Scholar 

  12. MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Oakland, CA, USA, pp. 281–297, June 1967

    Google Scholar 

  13. Mittal, D., Kaur, D., Aggarwal, A.: Secure data mining in cloud using homomorphic encryption. In: 2014 IEEE International Conference on Cloud Computing in Emerging Markets (CCEM), pp. 1–7. IEEE, October 2014

    Google Scholar 

  14. Rousseeuw, P.J.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987)

    Article  MATH  Google Scholar 

  15. Singh, M.D., Krishna, P.R., Saxena, A.: A privacy preserving jaccard similarity function for mining encrypted data. In: TENCON 2009–2009 IEEE Region 10 Conference, pp. 1–4. IEEE, January 2009

    Google Scholar 

  16. Vaidya, J., Clifton, C.W., Zhu, Y.M.: Privacy Preserving Data Mining, vol. 19. Springer, Heidelberg (2006)

    MATH  Google Scholar 

  17. Yang, Y., Maode, M.A.: Semantic searchable encryption scheme based on lattice in quantum-era. J. Inf. Sci. Eng. 32(2), 425–438 (2016)

    MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to Nawal Almutairi , Frans Coenen or Keith Dures .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Almutairi, N., Coenen, F., Dures, K. (2017). K-Means Clustering Using Homomorphic Encryption and an Updatable Distance Matrix: Secure Third Party Data Clustering with Limited Data Owner Interaction. In: Bellatreche, L., Chakravarthy, S. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2017. Lecture Notes in Computer Science(), vol 10440. Springer, Cham. https://doi.org/10.1007/978-3-319-64283-3_20

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-64283-3_20

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-64282-6

  • Online ISBN: 978-3-319-64283-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics