research-article

An efficient perturbation approach for multivariate data in sensitive and reliable data mining

Published: 01 November 2021

Abstract

With the rapid advancement of technology, the volume of cloud data is growing quickly, and much of it contains individuals' sensitive information, such as medical diagnostic reports. When knowledge is extracted from such sensitive data, both the privacy of individuals and the utility of the data should be preserved, which is a crucial concern in data mining. Although several privacy-preserving methods exist, no single method maintains a balance between privacy and data utility: achieving privacy often costs data utility, and vice versa. To address this issue, this work proposes a four-stage data perturbation approach for sensitive data mining, called NRoReM, based on normalization, geometric rotation, linear regression, and scalar multiplication. The approach is evaluated over ten UCI data sets using three benchmark classifiers. An empirical analysis of privacy protection, attack resistance, information entropy, data utility, and error shows that, for 90% of the data sets, NRoReM preserves both individuals' privacy and data utility to a greater degree than 3-Dimensional Rotation Transformation (3DRT) and 2-Dimensional Rotation Transformation (2DRT).
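
The abstract names the building blocks of the approach but does not specify how they are combined, so the following is only a minimal sketch of three of them (min-max normalization, a planar rotation of one attribute pair, and scalar multiplication) and not the NRoReM algorithm itself. The linear regression stage is left out because the abstract does not say how it is applied. The function names, the angle `theta`, the factor `k`, the choice of min-max scaling, and the sample records are all assumptions made for illustration. The sketch shows why such geometric steps can keep distance-based utility: rotation leaves Euclidean distances between records unchanged, and scalar multiplication rescales all of them by the same factor.

```python
import math


def min_max_normalize(rows):
    """Scale every column to [0, 1]; a constant column maps to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]


def rotate_pair(rows, i, j, theta):
    """Rotate attributes i and j of every record by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for row in rows:
        new = list(row)
        new[i] = c * row[i] - s * row[j]
        new[j] = s * row[i] + c * row[j]
        out.append(new)
    return out


def scale(rows, k):
    """Multiply every value by the scalar k."""
    return [[k * v for v in row] for row in rows]


def perturb(rows, theta=math.pi / 3, k=2.0):
    """Hypothetical pipeline: normalize, rotate attributes 0 and 1, scale.

    Returns both the normalized and the perturbed records so that the
    effect on pairwise distances can be inspected.
    """
    normalized = min_max_normalize(rows)
    rotated = rotate_pair(normalized, 0, 1, theta)
    return normalized, scale(rotated, k)


def distance(a, b):
    """Euclidean distance between two records."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up records with three numeric attributes (not taken from any data set).
records = [
    [63.0, 145.0, 233.0],
    [37.0, 130.0, 250.0],
    [41.0, 130.0, 204.0],
]

normalized, perturbed = perturb(records)

# Each pairwise distance after perturbation is k times the distance
# between the corresponding normalized records.
ratio = distance(perturbed[0], perturbed[1]) / distance(normalized[0], normalized[1])
print(ratio)
```

Because the distance ratio is the same for every pair of records, a distance-based classifier trained on the perturbed records sees the same neighborhood structure as one trained on the normalized records, even though the individual attribute values differ.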

Published In

Journal of Information Security and Applications, Volume 62, Issue C
Nov 2021
144 pages

Publisher

Elsevier Science Inc., United States

Author Tags

1. Privacy preserving data mining
2. Data perturbation
3. Data privacy
4. Information privacy
5. Privacy
6. Data utility
