Abstract
Cloud computing lets users outsource their data to third-party cloud storage for cost-effective resource management and on-demand network access. However, outsourcing data to a third-party location raises concerns about data privacy. To protect their privacy, users tend to encrypt sensitive data before outsourcing it. Encryption preserves privacy, but it also makes searching for a specific keyword time-consuming and challenging, particularly when the encryption key is not available. At the same time, the data owner should be able to perform multi-keyword searches to retrieve the documents that are relevant to the search query. This paper proposes a new privacy-preserving multi-keyword search approach for cloud outsourced data. The objective of the proposed approach is to allow data owners and authorized users to retrieve the most relevant data with minimum computation and communication overhead, fewer false positives (irrelevant documents), and reduced searching time. The proposed approach is evaluated on the NSF research dataset. The results show that, compared with other approaches, the proposed method achieves shorter searching time and better overall performance of the cloud environment in terms of computation and communication overhead as well as false positives.
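To make the problem setting concrete, the following is a minimal sketch of a generic multi-keyword search over an index of keyed keyword hashes. It is not the approach proposed in the paper, whose details the abstract does not give. The function names (`trapdoor`, `build_index`, `search`), the HMAC-SHA256 choice, the ranking rule (number of matched keywords, then term frequency), and the sample documents are all assumptions made for illustration.

```python
import hashlib
import hmac
from collections import Counter


def trapdoor(key: bytes, keyword: str) -> str:
    """Keyed hash of a lowercased keyword (hypothetical helper).

    The server only ever sees this token, never the plaintext keyword.
    """
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(key: bytes, docs: dict[str, str]) -> dict[str, Counter]:
    """Owner side: map each document id to counts of its hashed keywords."""
    return {
        doc_id: Counter(trapdoor(key, word) for word in text.split())
        for doc_id, text in docs.items()
    }


def search(index: dict[str, Counter], trapdoors: list[str], top_k: int = 3) -> list[str]:
    """Server side: rank documents that match at least one query trapdoor.

    Ranking (an assumption of this sketch): more matched keywords first,
    then higher total term frequency, then document id for stable ties.
    """
    scored = []
    for doc_id, counts in index.items():
        matched = sum(1 for t in trapdoors if t in counts)
        if matched:
            frequency = sum(counts[t] for t in trapdoors)
            scored.append((matched, frequency, doc_id))
    scored.sort(key=lambda s: (-s[0], -s[1], s[2]))
    return [doc_id for _, _, doc_id in scored[:top_k]]


# Illustrative data only; the key and documents are made up.
key = b"demo-key-not-for-production"
docs = {
    "d1": "cloud storage privacy",
    "d2": "cloud search keyword search",
    "d3": "image compression",
}
index = build_index(key, docs)
query = [trapdoor(key, word) for word in ("cloud", "search")]
results = search(index, query)
print(results)
```

In this sketch only a holder of the key can form valid query tokens, and the server matches tokens without seeing plaintext keywords. Deterministic hashing still reveals which documents share a keyword and how often a token is queried, so a practical scheme would need stronger protections than this illustration provides.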
Acknowledgements
We thank the people who provided technical assistance in obtaining the related IR information and concepts.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by B. B. Gupta.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Manasrah, A.M., Abu Nasir, M. & Salem, M. A privacy-preserving multi-keyword search approach in cloud computing. Soft Comput 24, 5609–5631 (2020). https://doi.org/10.1007/s00500-019-04033-z
DOI: https://doi.org/10.1007/s00500-019-04033-z