Abstract
This paper studies how to perform linear regression analysis accurately while guaranteeing \(\epsilon \)-differential privacy. The parameters of a linear regression model are sensitive to any single record in the database; as a result, a large amount of noise must be added to the parameters to protect the records, which leads to inaccurate results. To improve the accuracy of the published results, existing work enforces \(\epsilon \)-differential privacy by perturbing the coefficients of the objective function (loss function) of the optimization problem from which the regression parameters are derived, rather than adding noise to the parameters directly. However, the scale of the noise introduced by this technique is proportional to the square of the dimensionality: when the dimensionality is high, the noise becomes very large, i.e., the curse of dimensionality. To address this issue, this paper first determines a truncating length in a differentially private way, where the length bounds the maximal influence of any single record on the coefficients of the objective function. The noisy truncated coefficients are then published under this length limitation. Finally, the regression parameters are derived from the objective function with the noisy coefficients. Experiments on real datasets validate the effectiveness of our proposals.
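The pipeline the abstract describes can be sketched as follows. This is a minimal illustration of the general idea, not the authors' algorithm: the function name, the simple L1-norm clipping rule used for truncation, the fixed clipping bound `C` (the paper selects the truncating length privately, which is omitted here), and the Laplace scale are all our own assumptions. The quadratic loss \(\sum_i (y_i - x_i^\top w)^2\) is determined by the coefficient aggregates \(A = \sum_i x_i x_i^\top\) and \(b = \sum_i y_i x_i\); each record's coefficient contribution is clipped to length `C`, Laplace noise calibrated to `C` is added to the clipped aggregates, and the parameters are recovered by minimizing the noisy objective.

```python
import numpy as np

def truncated_functional_mechanism(X, y, epsilon, C, rng=None):
    """Sketch: functional mechanism with per-record truncation.

    The loss sum_i (y_i - x_i^T w)^2 = w^T A w - 2 b^T w + const is
    determined by A = sum_i x_i x_i^T and b = sum_i y_i x_i.  Each
    record's coefficient vector is clipped to L1 norm C, so the L1
    sensitivity of the released coefficients is bounded by 2C, and
    Laplace noise with scale 2C/epsilon is added to every coefficient.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    A = np.zeros((d, d))
    b = np.zeros(d)
    for xi, yi in zip(X, y):
        # One record's contribution to all coefficients of the loss.
        coef = np.concatenate([np.outer(xi, xi).ravel(), yi * xi])
        norm = np.abs(coef).sum()
        if norm > C:                      # truncate this record's influence
            coef *= C / norm
        A += coef[:d * d].reshape(d, d)
        b += coef[d * d:]
    scale = 2.0 * C / epsilon             # Laplace scale for sensitivity 2C
    A += rng.laplace(0.0, scale, size=(d, d))
    A = (A + A.T) / 2.0                   # keep the quadratic form symmetric
    b += rng.laplace(0.0, scale, size=d)
    # Minimize the noisy objective w^T A w - 2 b^T w, i.e. solve A w = b.
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w
```

With a large privacy budget and a clipping bound loose enough that no record is actually truncated, the result approaches the ordinary least-squares solution; tightening `C` or shrinking `epsilon` trades accuracy for privacy.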
Acknowledgement
This work was supported in part by the National Natural Science Foundation of China under Grant 61902365 and Grant 61902366, and in part by the China Postdoctoral Science Foundation under Grant 2019M652473, Grant 2019M652474, and Grant 2020T130623.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, Y. et al. (2021). Differentially Private Linear Regression Analysis via Truncating Technique. In: Xing, C., Fu, X., Zhang, Y., Zhang, G., Borjigin, C. (eds) Web Information Systems and Applications. WISA 2021. Lecture Notes in Computer Science, vol 12999. Springer, Cham. https://doi.org/10.1007/978-3-030-87571-8_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87570-1
Online ISBN: 978-3-030-87571-8