Prescriptive analytics with differential privacy

  • Regular Paper
International Journal of Data Science and Analytics

Abstract

Prescriptive analytics is a mechanism that recommends the best set of actions to take to prevent an undesirable outcome for a given instance. However, this mechanism is prone to privacy breaches if an adversary holding subsidiary data is allowed to query it multiple times. We therefore propose a differential privacy mechanism for prescriptive analytics to preserve data privacy. Differential privacy can be achieved using the sensitivity of the given actions. Roughly speaking, sensitivity is the maximum change in the given set of actions with respect to a change in the given instances. A general analytical form for the sensitivity of the prescriptive analytics mechanism is difficult to derive, so we formulate a nested constrained optimization problem to compute it. We use synthetic data to validate the behavior of the differential privacy mechanism under different privacy parameter settings. Experiments with two real-world datasets, Student Academic Performance and Reddit, demonstrate the usefulness of the proposed method in education and social policy design. We also propose a new evaluation measure, the prescription success rate, to further investigate the significance of the proposed method.
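
As a rough illustration of the idea described in the abstract, the Python sketch below pairs a toy prescription step (an inner optimization over actions) with a brute-force sensitivity estimate (an outer search over nearby instances) and then perturbs the prescription with noise scaled by sensitivity / epsilon. The risk model, the action grid, the neighbourhood radius, the choice of the Laplace mechanism, and every function name here are assumptions made for illustration; the abstract does not state which noise mechanism or optimizer the authors use. The grid search only yields an empirical estimate of the sensitivity, not a proven bound.

```python
import numpy as np

ACTIONS = np.linspace(0.0, 1.0, 101)      # hypothetical candidate actions


def prescribe(x):
    """Inner optimization: action with the lowest toy predicted risk for instance x."""
    risk = (ACTIONS - 0.5 * x) ** 2        # assumed outcome model, not from the paper
    return ACTIONS[np.argmin(risk)]


def estimate_sensitivity(instances, radius):
    """Outer search: largest change in the prescription over nearby instance pairs."""
    worst = 0.0
    for x in instances:
        for x_prime in instances:
            if abs(x - x_prime) <= radius:
                worst = max(worst, abs(prescribe(x) - prescribe(x_prime)))
    return worst


def private_prescription(x, sensitivity, epsilon, rng):
    """Laplace mechanism (assumed here): noise scale is sensitivity / epsilon."""
    return prescribe(x) + rng.laplace(loc=0.0, scale=sensitivity / epsilon)


rng = np.random.default_rng(0)
instances = np.linspace(0.0, 1.0, 51)
sensitivity = estimate_sensitivity(instances, radius=0.1)

# Smaller epsilon means stronger privacy and a noisier prescription.
for epsilon in (0.1, 1.0, 10.0):
    print(epsilon, private_prescription(0.6, sensitivity, epsilon, rng))
```

In this sketch a smaller epsilon gives a larger noise scale, which is the kind of trade-off between privacy and prescription quality that experiments over different privacy parameter settings would explore.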

Notes

  1. http://time.com/5205314/facebook-cambridge-analytica-breach/.

  2. https://www.reddit.com/r/stopdrinking/.

  3. https://www.kaggle.com/rmalshe/student-performance-prediction/data.

Acknowledgements

This research was partially funded by the Australian Government through the Australian Research Council (ARC). Professor Venkatesh is the recipient of an ARC Australian Laureate Fellowship (FL170100006).

Author information

Corresponding author

Correspondence to Haripriya Harikumar.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Harikumar, H., Rana, S., Gupta, S. et al. Prescriptive analytics with differential privacy. Int J Data Sci Anal 13, 123–138 (2022). https://doi.org/10.1007/s41060-021-00286-w

  • DOI: https://doi.org/10.1007/s41060-021-00286-w
