Abstract
In this paper we introduce the concept of privacy preserving data mining. In our model, two parties owning confidential databases wish to run a data mining algorithm on the union of their databases, without revealing any unnecessary information. This problem has many practical and important applications, such as in medical research with confidential patient records.
Data mining algorithms are usually complex, especially as the size of the input is measured in megabytes, if not gigabytes. A generic secure multi-party computation solution, based on evaluation of a circuit computing the algorithm on the entire input, is therefore of no practical use. We focus on the problem of decision tree learning and use ID3, a popular and widely used algorithm for this problem. We present a solution that is considerably more efficient than generic solutions. It demands very few rounds of communication and reasonable bandwidth. In our solution, each party performs by itself a computation of the same order as computing the ID3 algorithm for its own database. The results are then combined using efficient cryptographic protocols, whose overhead is only logarithmic in the number of transactions in the databases. We feel that our result is a substantial contribution, demonstrating that secure multi-party computation can be made practical, even for complex problems and large inputs.
Supported by an Eshkol grant of the Israel Ministry of Science.
Chapter PDF
Similar content being viewed by others
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
M. Bellare and S. Micali, Non-interactive oblivious transfer and applications, Advances in Cryptology-Cryptord’ 89, pp. 547–557, 1990.
M. Ben-Or, S. Goldwasser and A. Wigderson, Completeness theorems for non cryptographic fault tolerant distributed computation, 20th STOC, (1988), 1–9.
D. Boneh and M. Franklin, Efficient generation of shared RSA keys, Proc. of Crypto’ 97, LNCS, Vol. 1233, Springer-Verlag, pp. 425–439, 1997.
R. Canetti, Security and Composition of Multi-party Cryptographic Protocols. To appear in the Journal of Cryptology. Available from the Theory of Cryptography Library at http://philby.ucsd.edu/cryptlib, 1998.
D. Chaum, C. Crepeau and I. Damgard, Multiparty unconditionally secure protocols, 20th Proc. ACM Symp. on Theory of Computing, (1988), 11–19.
B. Chor, O. Goldreich, E. Kushilevitz and M. Sudan, Private Information Retrieval, 36th FOCS, pp. 41–50, 1995.
R. Cramer, N. Gilboa. M. Naor, B. Pinkas and G. Poupard, Oblivious Polynomial Evaluation, 2000.
S. Even, O. Goldreich and A. Lempel, A Randomized Protocol for Signing Contracts, Communications of the ACM 28, pp. 637–647, 1985.
R. Fagin, M. Naor and P. Winkler, Comparing Information Without Leaking It, Communications of the ACM, vol 39, May 1996, pp. 77–85.
J. Feigenbaum, J. Fong, M. Strauss and R. N. Wright, Secure Multiparty Computation of Approximations, manuscript, 2000.
N. Gilboa, Two Party RSA Key Generation, Proc of Crypto’ 99, Lecture Notes in Computer Science, Vol. 1666, Springer-Verlag, pp. 116–129, 1999.
O. Goldreich, Secure Multi-Party Computation, 1998. (Available at http://philby.ucsd.edu)
O. Goldreich, S. Micali and A. Wigderson, How to Play any Mental Game-A Completeness Theorem for Protocols with Honest Majority. In 19th ACM Symposium on the Theory of Computing, pp. 218–229, 1987.
J. Kilian, Uses of randomness in algorithms and protocols, MIT Press, 1990.
T. Mitchell, Machine Learning. McGraw Hill, 1997.
M. Naor and B. Pinkas, Oblivious Transfer and Polynomial Evaluation, Proc. of the 31st STOC, Atlanta, GA, pp. 245–254, May 1–4, 1999.
M. Naor and B. Pinkas, Efficient Oblivious Transfer Protocols, manuscript, 2000.
P. Paillier, Public-Key Cryptosystems Based on Composite Degree Residuocity Classes. Proc. of Eurocrypt’ 99, LNCS Vol. 1592, pp. 223–238, 1999.
J. Ross Quinlan, Induction of Decision Trees. Machine Learning 1(1): 81–106 (1986)
M. O. Rabin, How to exchange secrets by oblivious transfer, Tech. Memo TR-81, Aiken Computation Laboratory, 1981.
A.C. Yao, How to generate and exchange secrets, Proc. of the 27th IEEE Symp. on Foundations of Computer Science, 1986, pp. 162–167.
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lindell, Y., Pinkas, B. (2000). Privacy Preserving Data Mining. In: Bellare, M. (eds) Advances in Cryptology — CRYPTO 2000. CRYPTO 2000. Lecture Notes in Computer Science, vol 1880. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44598-6_3
Download citation
DOI: https://doi.org/10.1007/3-540-44598-6_3
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67907-3
Online ISBN: 978-3-540-44598-2
eBook Packages: Springer Book Archive