Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Privacy-Preserving Computation of Bayesian Networks on Vertically Partitioned Data

Published: 01 September 2006 Publication History

Abstract

Traditionally, many data mining techniques have been designed in the centralized model in which all data is collected and available in one central site. However, as more and more activities are carried out using computers and computer networks, the amount of potentially sensitive data stored by business, governments, and other parties increases. Different parties often wish to benefit from cooperative use of their data, but privacy regulations and other privacy concerns may prevent the parties from sharing their data. Privacy-preserving data mining provides a solution by creating distributed data mining algorithms in which the underlying data need not be revealed. In this paper, we present privacy-preserving protocols for a particular data mining task: learning a Bayesian network from a database vertically partitioned among two parties. In this setting, two parties owning confidential databases wish to learn the Bayesian network on the combination of their databases without revealing anything else about their data to each other. We present an efficient and privacy-preserving protocol to construct a Bayesian network on the parties' joint data.

References

[1]
D. Agrawal and C. Aggarwal, “On the Design and Quantification of Privacy Preserving Data Mining Algorithms,” Proc. 20th ACM SIGMOD-SIGACT-SIGART Symp. Principles of Database Systems, pp. 247-255, 2001.
[2]
R. Agrawal, A. Evfimievski, and R. Srikant, “Information Sharing across Private Databases,” Proc. 2003 ACM SIGMOD Int'l Conf. Management of Data, pp. 86-97, 2003.
[3]
Rakesh Agrawal, Ramakrishnan Srikant, Privacy-preserving data mining, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.439-450, May 15-18, 2000, Dallas, Texas, United States
[4]
M. Atallah and W. Du, “Secure Multi-Party Computational Geometry,” Proc. Seventh Int'l Workshop Algorithms and Data Structures, pp. 165-179, 2001.
[5]
J. Cryptology, vol. 13, no. 1, pp. 143-202, 2000.
[6]
R. Canetti, Y. Ishai, R. Kumar, M. Reiter, R. Rubinfeld, and R.N. Wright, “Selective Private Function Evaluation with Applications to Private Statistics,” Proc. 20th Ann. ACM Symp. Principles of Distributed Computing, pp. 293-304, 2001.
[7]
J. Canny, “Collaborative Filtering with Privacy,” Proc. 2002 IEEE Symp. Security and Privacy, pp. 45-57, 2002.
[8]
R. Chen, K. Sivakumar, and H. Kargupta, “Learning Bayesian Network Structure from Distributed Data,” Proc. SIAM Int'l Data Mining Conf., pp. 284-288, 2003.
[9]
Knowledge Information Syststems, vol. 6, no. 2, pp. 164-187, 2004.
[10]
D.M. Chickering, “Learning Bayesian Networks is NP-Complete,” Learning from Data: Artificial Intelligence and Statistics V, pp. 121-130, 1996.
[11]
G. Cooper and E. Herskovits, “A Bayesian Method for the Induction of Probabilistic Networks from Data,” Machine Learning, vol. 9, no. 4, pp. 309-347, 1992.
[12]
J. Daemen and V. Rijmen, The Design of Rijndael: AES— The Advanced Encryption Standard. Springer-Verlag, 2002.
[13]
V. Estivill-Castro and L. Brankovic, “Balancing Privacy against Precision in Mining for Logic Rules,” Proc. First Int'l Data Warehousing and Knowledge Discovery, pp. 389-398, 1999.
[14]
Alexandre Evfimievski, Ramakrishnan Srikant, Rakesh Agrawal, Johannes Gehrke, Privacy preserving mining of association rules, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, July 23-26, 2002, Edmonton, Alberta, Canada
[15]
M. Freedman, K. Nissim, and B. Pinkas, “Efficient Private Matching and Set Intersection,” Advances in Cryptology— Proc. EUROCRYPT 2004, pp. 1-19, Springer-Verlag, 2004.
[16]
B. Goethals, S. Laur, H. Lipmaa, and T. Mielikainen, “On Private Scalar Product Computation for Privacy-Preserving Data Mining,” Information Security and Cryptology— Proc. ICISC, vol. 3506, pp. 104-120, 2004.
[17]
O. Goldreich, S. Micali, and A. Wigderson, “How to Play ANY Mental Game,” Proc. 19th Ann. ACM Conf. Theory of Computing, pp. 218-229, 1987.
[18]
O. Goldreich, Foundations of Cryptography, Volume II: Basic Applications. Cambridge Univ. Press, 2004.
[19]
The Health Insurance Portability and Accountability Act of 1996,
[20]
G. Jagannathan, K. Pillaipakkamnatt, and R.N. Wright, “A New Privacy-Preserving Distributed $k{\hbox{-}}{\rm{Clustering}}$ Algorithm,” Proc. Sixth SIAM Int'l Conf. Data Mining, 2006.
[21]
G. Jagannathan and R.N. Wright, “Privacy-Preserving Distributed $k{\hbox{-}}{\rm{Means}}$ Clustering over Arbitrarily Partitioned Data,” Proc. 11th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 593-599, 2005.
[22]
E. Johnson and H. Kargupta, “Collective, Hierarchical Clustering from Distributed, Heterogeneous Data,” Lecture Notes in Computer Science, vol. 1759, pp. 221-244, 1999.
[23]
M. Kantarcioglu and C. Clifton, “Privacy-Preserving Distributed Mining of Association Rules on Horizontally Partitioned Data,” Proc. ACM SIGMOD Workshop Research Issues on Data Mining and Knowledge Discovery (DMKD '02), pp. 24-31, June 2002.
[24]
M. Kantarcioglu and J. Vaidya, “Privacy Preserving Naive Bayes Classifier for Horizontally Partitioned Data,” Proc. IEEE Workshop Privacy Preserving Data Mining, 2003.
[25]
M. Kantarcioglu, J. Jin, and C. Clifton, “When Do Data Mining Results Violate Privacy?” Proc. 10th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 599-604, 2004.
[26]
O. Kardes, R.S. Ryger, R.N. Wright, and J. Feigenbaum, “Implementing Privacy-Preserving Bayesian-Net Discovery for Vertically Partitioned Data,” Proc. Int'l Conf. Data Mining Workshop Privacy and Security Aspects of Data Mining, 2005.
[27]
H. Kargupta, S. Datta, Q. Wang, and K. Sivakumar, “On the Privacy Preserving Properties of Random Data Perturbation Techniques,” Proc. Third IEEE Int'l Conf. Data Mining, pp. 99-106, 2003.
[28]
H. Kargupta, B. Park, D. Hershberger, and E. Johnson, “Collective Data Mining: A New Perspective towards Distributed Data Mining,” Advances in Distributed and Parallel Knowledge Discovery, AAAI/MIT Press, 2000.
[29]
J. Cryptology, vol. 15, no. 3, pp. 177-206, 2002.
[30]
K. Liu, H. Kargupta, and J. Ryan, “Multiplicative Noise, Random Projection, and Privacy Preserving Data Mining from Distributed Multi-Party Data,” Technical Report TR-CS-03-24, Computer Science and Electrical Eng. Dept., Univ. of Maryland, Baltimore County, 2003.
[31]
D. Malkhi, N. Nisan, B. Pinkas, and Y. Sella, “Fairplay— A Secure Two-Party Computation System,” Proc. 13th Usenix Security Symp., pp. 287-302, 2004.
[32]
D. Meng, K. Sivakumar, and H. Kargupta, “Privacy-Sensitive Bayesian Network Parameter Learning,” Proc. Fourth IEEE Int'l Conf. Data Mining, pp. 487-490, 2004.
[33]
IEEE Expert, vol. 10, no. 2, pp. 48-52, 1995.
[34]
P. Paillier, “Public-Key Cryptosystems Based on Composite Degree Residue Classes,” Advances in Cryptography— Proc. EUROCRYPT '99, pp. 223-238, 1999.
[35]
European Parliament, “Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the Protection of Individuals with Regard to the Processing of Personal Data and on the Free Movement of Such Data,” Official J. European Communities, p. 31 1995.
[36]
European Parliament, “Directive 97/66/EC of the European Parliament and of the Council of 15 December 1997 Concerning the Processing of Personal Data and the Protection of Privacy in the Telecommunications Sector,” Official J. European Communities, pp. 1-8, 1998.
[37]
S. Rizvi and J. Haritsa, “Maintaining Data Privacy in Association Rule Mining,” Proc. 28th Very Large Data Bases Conf., pp. 682-693, 2002.
[38]
S. Stolfo, A. Prodromidis, S. Tselepis, W. Lee, D. Fan, and P. Chan, “JAM: Java Agents for Meta-Learning over Distributed Databases,” Knowledge Discovery and Data Mining, pp. 74-81, 1997.
[39]
H. Subramaniam, R.N. Wright, and Z. Yang, “Experimental Analysis of Privacy-Preserving Statistics Computation,” Proc. Very Large Data Bases Worshop Secure Data Management, pp. 55-66, Aug. 2004.
[40]
Jaideep Vaidya, Chris Clifton, Privacy preserving association rule mining in vertically partitioned data, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, July 23-26, 2002, Edmonton, Alberta, Canada
[41]
J. Vaidya and C. Clifton, “Privacy-Preserving k-Means Clustering over Vertically Partitioned Data,” Proc. Ninth ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 206-215, 2003.
[42]
J. Vaidya and C. Clifton, “Privacy Preserving Naive Bayes Classifier on Vertically Partitioned Data,” Proc. 2004 SIAM Int'l Conf. Data Mining, 2004.
[43]
R.N. Wright and Z. Yang, “Privacy-Preserving Bayesian Network Structure Computation on Distributed Heterogeneous Data,” Proc. 10th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 713-718, 2004.
[44]
Information and Computation, vol. 150, no. 1, pp. 22-56, 1999.
[45]
Z. Yang and R.N. Wright, “Improved Privacy-Preserving Bayesian Network Parameter Learning on Vertically Partitioned Data,” Proc. Int'l Conf. Data Eng. Int'l Workshop Privacy Data Management, Apr. 2005.
[46]
A. Yao, “How to Generate and Exchange Secrets,” Proc. 27th IEEE Symp. Foundations of Computer Science, pp. 162-167, 1986.

Cited By

View all
  • (2020)DOS-GAN: A Distributed Over-Sampling Method Based on Generative Adversarial Networks for Distributed Class-Imbalance LearningAlgorithms and Architectures for Parallel Processing10.1007/978-3-030-60248-2_42(609-622)Online publication date: 2-Oct-2020
  • (2016)Rough sets in distributed decision information systemsKnowledge-Based Systems10.1016/j.knosys.2015.10.02594:C(13-22)Online publication date: 15-Feb-2016
  • (2014)Privacy-Preserving Kriging Interpolation on Distributed DataProceedings of the 14th International Conference on Computational Science and Its Applications — ICCSA 2014 - Volume 858410.1007/978-3-319-09153-2_52(695-708)Online publication date: 30-Jun-2014
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image IEEE Transactions on Knowledge and Data Engineering
IEEE Transactions on Knowledge and Data Engineering  Volume 18, Issue 9
September 2006
144 pages

Publisher

IEEE Educational Activities Department

United States

Publication History

Published: 01 September 2006

Author Tags

  1. Bayesian networks
  2. Data privacy
  3. privacy-preserving data mining.

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 25 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2020)DOS-GAN: A Distributed Over-Sampling Method Based on Generative Adversarial Networks for Distributed Class-Imbalance LearningAlgorithms and Architectures for Parallel Processing10.1007/978-3-030-60248-2_42(609-622)Online publication date: 2-Oct-2020
  • (2016)Rough sets in distributed decision information systemsKnowledge-Based Systems10.1016/j.knosys.2015.10.02594:C(13-22)Online publication date: 15-Feb-2016
  • (2014)Privacy-Preserving Kriging Interpolation on Distributed DataProceedings of the 14th International Conference on Computational Science and Its Applications — ICCSA 2014 - Volume 858410.1007/978-3-319-09153-2_52(695-708)Online publication date: 30-Jun-2014
  • (2011)Secure construction and publication of contingency tables from distributed dataJournal of Computer Security10.5555/2011016.201101819:3(453-484)Online publication date: 1-Aug-2011
  • (2011)Privacy-preserving approach to bayesian network structure learning from distributed dataProceedings of the 13th annual conference companion on Genetic and evolutionary computation10.1145/2001858.2002103(815-816)Online publication date: 12-Jul-2011
  • (2010)A distributed and privacy-preserving method for network intrusion detectionProceedings of the 2010 international conference on On the move to meaningful internet systems: Part II10.5555/1926129.1926148(861-875)Online publication date: 25-Oct-2010
  • (2010)Communication-Efficient Privacy-Preserving ClusteringTransactions on Data Privacy10.5555/1747335.17473363:1(1-25)Online publication date: 1-Apr-2010
  • (2009)Impossibility of unconditionally secure scalar productsData & Knowledge Engineering10.1016/j.datak.2009.04.00668:10(1059-1070)Online publication date: 1-Oct-2009
  • (2008)Privacy-preserving data mining in the malicious modelInternational Journal of Information and Computer Security10.1504/IJICS.2008.0224882:4(353-375)Online publication date: 1-Jan-2008
  • (2007)Towards privacy-preserving model selectionProceedings of the 1st ACM SIGKDD international conference on Privacy, security, and trust in KDD10.5555/1793474.1793484(138-152)Online publication date: 12-Aug-2007

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media