DOI: 10.1145/2487575.2487601
Research article

Mining frequent graph patterns with differential privacy

Published: 11 August 2013

Abstract

Discovering frequent graph patterns in a graph database offers valuable information in a variety of applications. However, if the graph dataset contains sensitive data about individuals, such as mobile phone-call graphs and web-click graphs, releasing the discovered frequent patterns may threaten the privacy of those individuals. Differential privacy has recently emerged as the de facto standard for private data analysis due to its provable privacy guarantee. In this paper we propose the first differentially private algorithm for mining frequent graph patterns.
We first show that previous techniques for differentially private discovery of frequent itemsets cannot be applied to mining frequent graph patterns, due to the inherent complexity of handling structural information in graphs. We then address this challenge by proposing a Markov Chain Monte Carlo (MCMC) sampling-based algorithm. Unlike previous work on frequent itemset mining, our techniques do not rely on the output of a non-private mining algorithm. Instead, we observe that both frequent graph pattern mining and the guarantee of differential privacy can be unified into an MCMC sampling framework. In addition, we establish the privacy and utility guarantees of our algorithm and propose an efficient neighboring pattern counting technique. Experimental results show that the proposed algorithm is able to output frequent patterns with good precision.
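
To make the idea of combining pattern sampling with a privacy guarantee more concrete, here is a minimal sketch. It is not the algorithm from the paper. It is a toy Metropolis-Hastings walk whose target distribution is that of the exponential mechanism, with probability proportional to exp(epsilon * support / 2). Several simplifying assumptions are made for illustration only: each graph is a set of labeled edges, a pattern "occurs" in a graph when its edges are a subset of the graph's edges (which sidesteps subgraph isomorphism), neighboring patterns differ by one edge, and all function names (`support`, `neighbors`, `mh_sample`) are hypothetical.

```python
import math
import random


def support(pattern, database):
    """Count the graphs whose edge set contains every edge of the pattern."""
    return sum(1 for graph in database if pattern <= graph)


def neighbors(pattern, universe):
    """Patterns reachable by adding or removing one edge (never the empty pattern)."""
    result = []
    for edge in universe:
        candidate = pattern ^ {edge}  # toggle one edge
        if candidate:
            result.append(frozenset(candidate))
    return result


def mh_sample(database, epsilon, steps, rng):
    """Run a Metropolis-Hastings walk over patterns and return the final state.

    Assumed target: pi(p) proportional to exp(epsilon * support(p) / 2), i.e. the
    exponential mechanism with support as the score. Adding or removing one
    graph changes any support by at most 1, so the sensitivity is taken as 1.
    """
    universe = sorted(set().union(*database))
    current = frozenset([rng.choice(universe)])
    for _ in range(steps):
        current_nbrs = neighbors(current, universe)
        proposal = rng.choice(current_nbrs)
        proposal_nbrs = neighbors(proposal, universe)
        # Log of the target ratio pi(proposal) / pi(current).
        log_ratio = epsilon * (support(proposal, database) - support(current, database)) / 2
        # Hastings correction, because patterns can have different neighbor counts.
        log_ratio += math.log(len(current_nbrs)) - math.log(len(proposal_nbrs))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
    return current


# Toy database: ten graphs contain edge (a, b); only one also contains (b, c).
db = [frozenset({("a", "b")})] * 9 + [frozenset({("a", "b"), ("b", "c")})]
sample = mh_sample(db, epsilon=20.0, steps=200, rng=random.Random(0))
print(sorted(sample), support(sample, db))
```

In this sketch the walk never needs the output of a non-private miner: it only evaluates the support of the current pattern and of one proposed neighbor per step. The sample matches the exponential-mechanism distribution only once the chain has converged, so a finite number of steps gives an approximation of the intended guarantee, and releasing several patterns would require dividing the privacy budget among them.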

      Published In

      KDD '13: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
      August 2013
      1534 pages
      ISBN: 9781450321747
      DOI: 10.1145/2487575

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. differential privacy
      2. graph pattern mining

      Qualifiers

      • Research-article

      Conference

      KDD '13

      Acceptance Rates

      KDD '13 Paper Acceptance Rate 125 of 726 submissions, 17%;
      Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
