Abstract
Third party data analysis raises data privacy preservation concerns, therefore raising questions as to whether such outsourcing is viable. Cryptography allows a level of data confidentiality. Although some cryptography algorithms, such as Homomorphic Encryption (HE), allow a limited amount of data manipulation, the disadvantage is that encryption precludes any form of sophisticated analysis. For this to be achieved the encrypted data needs to coupled with additional information to facilitate third party analysis. This paper proposes a mechanism for secure k-means clustering that uses HE and the concept of an Updatable Distance Matrix (UDM). The mechanism is fully described and analysed. The reported evaluation shows that the proposed mechanism produces identical clustering results as when “standard” k-means is applied, but in a secure manner. The proposed mechanism thus allows the application of clustering algorithms to encrypted data while preserving both correctness and data privacy.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
A “trapdoor” in this context is a function that can be simply computed on one direction but is difficult to compute in the reverse direction without additional knowledge; the concept is widely used in cryptography.
References
Agrawal, R., Srikant, R.: Privacy-preserving data mining. SIGMOD Rec. 29(2), 439–450 (2000)
Berinato, S.: There’s no such thing as anonymous data. Harvard Business Review, February 2015
Chen, T., Chen, J., Zhou, B.: A system for parallel data mining service on cloud. In: 2012 Second International Conference on Cloud and Green Computing, pp. 329–330, November 2012
Chhinkaniwala, H., Garg, S.: Privacy preserving data mining techniques: Challenges and issues. In: Proceedings of International Conference on Computer Science and Information Technology, CSlT, pp. 609, July 2011
Erkin, Z., Veugen, T., Toft, T., Lagendijk, R.L.: Privacy-preserving user clustering in a social network. In: 2009 First IEEE International Workshop on Information Forensics and Security (WIFS), pp. 96–100. IEEE, December 2009
Goldreich, O.: Foundations of Cryptography: Volume 2, Basic Applications. Cambridge University Press, New York (2009)
Jha, S., Kruger, L., McDaniel, P.: Privacy preserving clustering. In: Vimercati, S.C., Syverson, P., Gollmann, D. (eds.) ESORICS 2005. LNCS, vol. 3679, pp. 397–417. Springer, Heidelberg (2005). doi:10.1007/11555827_23
Lichman, M.: UCI machine learning repository (2013)
Lindell, Y., Pinkas, B.: Privacy preserving data mining. J. Cryptol. 15(3), 177–206 (2002)
Liu, D.: Homomorphic encryption for database querying, December 2013
Liu, D., Bertino, E., Yi, X.: Privacy of outsourced k-means clustering. In: Proceedings of the 9th ACM Symposium on Information, Computer and Communications Security, pp. 123–134. ACM, June 2014
MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Oakland, CA, USA, pp. 281–297, June 1967
Mittal, D., Kaur, D., Aggarwal, A.: Secure data mining in cloud using homomorphic encryption. In: 2014 IEEE International Conference on Cloud Computing in Emerging Markets (CCEM), pp. 1–7. IEEE, October 2014
Rousseeuw, P.J.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987)
Singh, M.D., Krishna, P.R., Saxena, A.: A privacy preserving jaccard similarity function for mining encrypted data. In: TENCON 2009–2009 IEEE Region 10 Conference, pp. 1–4. IEEE, January 2009
Vaidya, J., Clifton, C.W., Zhu, Y.M.: Privacy Preserving Data Mining, vol. 19. Springer, Heidelberg (2006)
Yang, Y., Maode, M.A.: Semantic searchable encryption scheme based on lattice in quantum-era. J. Inf. Sci. Eng. 32(2), 425–438 (2016)
Author information
Authors and Affiliations
Corresponding authors
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Almutairi, N., Coenen, F., Dures, K. (2017). K-Means Clustering Using Homomorphic Encryption and an Updatable Distance Matrix: Secure Third Party Data Clustering with Limited Data Owner Interaction. In: Bellatreche, L., Chakravarthy, S. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2017. Lecture Notes in Computer Science(), vol 10440. Springer, Cham. https://doi.org/10.1007/978-3-319-64283-3_20
Download citation
DOI: https://doi.org/10.1007/978-3-319-64283-3_20
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64282-6
Online ISBN: 978-3-319-64283-3
eBook Packages: Computer ScienceComputer Science (R0)