Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3133956.3134030acmconferencesArticle/Chapter ViewAbstractPublication PagesccsConference Proceedingsconference-collections
research-article
Open access

Composing Differential Privacy and Secure Computation: A Case Study on Scaling Private Record Linkage

Published: 30 October 2017 Publication History

Abstract

Private record linkage (PRL) is the problem of identifying pairs of records that are similar as per an input matching rule from databases held by two parties that do not trust one another. We identify three key desiderata that a PRL solution must ensure: (1) perfect precision and high recall of matching pairs, (2) a proof of end-to-end privacy, and (3) communication and computational costs that scale subquadratically in the number of input records. We show that all of the existing solutions for PRL? including secure 2-party computation (S2PC), and their variants that use non-private or differentially private (DP) blocking to ensure subquadratic cost -- violate at least one of the three desiderata. In particular, S2PC techniques guarantee end-to-end privacy but have either low recall or quadratic cost. In contrast, no end-to-end privacy guarantee has been formalized for solutions that achieve subquadratic cost. This is true even for solutions that compose DP and S2PC: DP does not permit the release of any exact information about the databases, while S2PC algorithms for PRL allow the release of matching records.
In light of this deficiency, we propose a novel privacy model, called output constrained differential privacy, that shares the strong privacy protection of DP, but allows for the truthful release of the output of a certain function applied to the data. We apply this to PRL, and show that protocols satisfying this privacy model permit the disclosure of the true matching records, but their execution is insensitive to the presence or absence of a single non-matching record. We find that prior work that combine DP and S2PC techniques even fail to satisfy this end-to-end privacy model. Hence, we develop novel protocols that provably achieve this end-to-end privacy guarantee, together with the other two desiderata of PRL. Our empirical evaluation also shows that our protocols obtain high recall, scale near linearly in the size of the input databases and the output set of matching pairs, and have communication and computational costs that are at least 2 orders of magnitude smaller than S2PC baselines.

Supplemental Material

MP4 File

References

[1]
Ali Al-Lawati, Dongwon Lee, and Patrick McDaniel. 2005. Blocking-aware Private Record Linkage. In IQIS.
[2]
Dima Alhadidi, Noman Mohammed, Benjamin C. M. Fung, and Mourad Debbabi 2012. Secure Distributed Framework for Achieving ε-differential Privacy PETS.
[3]
Mikhail J. Atallah, Florian Kerschbaum, and Wenliang Du. 2003. Secure and Private Sequence Comparisons. In WPES.
[4]
Jeremiah Blocki, Avrim Blum, Anupam Datta, and Or Sheffet. 2012. The Johnson-Lindenstrauss Transform Itself Preserves Differential Privacy FOCS.
[5]
Jianneng Cao, Fang-Yu Rao, Elisa Bertino, and Murat Kantarcioglu 2015. A hybrid private record linkage scheme: Separating differentially private synopses from matching records. In ICDE.
[6]
Peter Christen. 2012. Data Matching: Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection. Springer Publishing Company, Incorporated.
[7]
Tim Churches and Peter Christen 2004. Blind data linkage using n-gram similarity comparisons KDD.
[8]
Tim Churches and Peter Christen 2004. Some methods for blindfolded record linkage. BMC Medical Informatics and Decision Making, Vol. 4, 1 (2004), 1.
[9]
Xin Luna Dong and Divesh Srivastava 2013. Big Data Integration. VLDB (2013).
[10]
Cynthia Dwork. 2006. Differential Privacy ICALP 2006.
[11]
Cynthia Dwork and Aaron Roth 2014. The Algorithmic Foundations of Differential Privacy. Found. Trends Theor. Comput. Sci. (2014).
[12]
Michael J. Freedman, Kobbi Nissim, and Benny Pinkas. 2004. Efficient Private Matching and Set Intersection. EUROCRYPT.
[13]
Lise Getoor and Ashwin Machanavajjhala 2013. Entity Resolution for Big Data. In KDD.
[14]
Aristides Gionis, Piotr Indyk, and Rajeev Motwani. 1999. Similarity Search in High Dimensions via Hashing. VLDB.
[15]
Oded Goldreich. 2004. Foundations of Cryptography: Volume 2, Basic Applications. Cambridge University Press, New York, NY, USA.
[16]
Slawomir Goryczka, Li Xiong, and Vaidy Sunderam. 2013. Secure Multiparty Aggregation with Differential Privacy: A Comparative Study EDBT.
[17]
Xi He, Ashwin Machanavajjhala, and Bolin Ding. 2014. Blowfish Privacy: Tuning Privacy-utility Trade-offs Using Policies SIGMOD.
[18]
Ali Inan, Murat Kantarcioglu, Elisa Bertino, and Monica Scannapieco 2008. A Hybrid Approach to Private Record Linkage. In ICDE.
[19]
Ali Inan, Murat Kantarcioglu, Gabriel Ghinita, and Elisa Bertino 2010. Private Record Matching Using Differential Privacy EDBT.
[20]
Alexandros Karakasidis and Vassilios S. Verykios 2009. Privacy Preserving Record Linkage Using Phonetic Codes BCI.
[21]
Dimitrios Karapiperis and Vassilios S. Verykios 2015. An LSH-Based Blocking Approach with a Homomorphic Matching Technique for Privacy-Preserving Record Linkage. TKDE (2015).
[22]
Michael Kearns, Aaron Roth, Zhiwei Steven Wu, and Grigory Yaroslavtsev 2016. Private algorithms for the protected in social network search. Proceedings of the National Academy of Sciences (2016).
[23]
Hanna Köpcke, Andreas Thor, and Erhard Rahm 2010. Benchmark datasets for entity resolution. http://dbs.uni-leipzig.de/en/research/projects/object_matching/fever/benchmark_datasets_for_entity_resolution. (2010).
[24]
Mehmet Kuzu, Murat Kantarcioglu, Ali Inan, Elisa Bertino, Elizabeth Durham, and Bradley Malin. 2013. Efficient Privacy-aware Record Integration. In EDBT.
[25]
Ilya Mironov, Omkant Pandey, Omer Reingold, and Salil Vadhan 2009. Computational Differential Privacy. In CRYPTO.
[26]
Noman Mohammed, Dima Alhadidi, Benjamin C. M. Fung, and Mourad Debbabi 2014. Secure Two-Party Differentially Private Data Release for Vertically Partitioned Data. IEEE Transactions on Dependable and Secure Computing (2014).
[27]
Arjun Narayan and Andreas Haeberlen 2012. DJoin: Differentially Private Join Queries over Distributed Databases OSDI.
[28]
Pascal Paillier. 1999. Advances in Cryptology. Springer Berlin Heidelberg, Chapter Public-Key Cryptosystems Based on Composite Degree Residuosity Classes.
[29]
Chaoyi Pang, Lifang Gu, David Hansen, and Anthony Maeder. 2009. Privacy-Preserving Fuzzy Matching Using a Public Reference Table.
[30]
Manas Pathak, Shantanu Rane, and Bhiksha Raj 2010. Multiparty Differential Privacy via Aggregation of Locally Trained Classifiers. Advances in Neural Information Processing Systems 23, J. Lafferty, C. Williams, J. Shawe-taylor, R.s. Zemel, and A. Culotta (Eds.). http://books.nips.cc/papers/files/nips23/NIPS2010_0408.pdf
[31]
Martin Pettai and Peeter Laud 2015. Combining Differential Privacy and Secure Multiparty Computation ACSAC.
[32]
Benny Pinkas, Thomas Schneider, and Michael Zohner. 2016. Scalable Private Set Intersection Based on OT Extension. Cryptology ePrint Archive, Report 2016/930. (2016). shownotehttp://eprint.iacr.org/2016/930.
[33]
Pradeep Ravikumar and Stephen E. Fienberg 2004. A secure protocol for computing string distance metrics In PSDM held at ICDM. 40--46.
[34]
Monica Scannapieco, Ilya Figotin, Elisa Bertino, and Ahmed K. Elmagarmid 2007. Privacy Preserving Schema and Data Matching. In SIGMOD.
[35]
Rainer Schnell, Tobias Bachteler, and Jörg Reiher. 2009. Privacy-preserving record linkage using Bloom filters. BMC Medical Informatics and Decision Making, Vol. 9, 1 (2009), 41.
[36]
Adam Smith. 2015. The Privacy of Secured Computations. In Crypto & Big Data Workshop.
[37]
NYC Taxi and Limousine Commission 2013. TLC Trip Record Data. http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml. (2013).
[38]
Brian Thorne. 2016. Python-paillie. https://readthedocs.org/projects/python-paillier/. (2016).
[39]
Sameer Wagh, Paul Cuff, and Prateek Mittal 2016. Root ORAM: A Tunable Differentially Private Oblivious RAM. CoRR Vol. abs/1601.03378 (2016).
[40]
Andrew Chi-Chih Yao. 1986. How to Generate and Exchange Secrets. In SFCS.
[41]
Qingsong Ye, Ron Steinfeld, Josef Pieprzyk, and Huaxiong Wang 2009. Efficient Fuzzy Matching and Intersection on Private Datasets ICISC.

Cited By

View all
  • (2024)Use Cases Requiring Privacy-Preserving Record Linkage in Paediatric OncologyCancers10.3390/cancers1615269616:15(2696)Online publication date: 29-Jul-2024
  • (2024)SecretFlow-SCQL: A Secure Collaborative Query PlatformProceedings of the VLDB Endowment10.14778/3685800.368582117:12(3987-4000)Online publication date: 1-Aug-2024
  • (2024)Scenario-based Adaptations of Differential Privacy: A Technical SurveyACM Computing Surveys10.1145/365115356:8(1-39)Online publication date: 26-Apr-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
CCS '17: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security
October 2017
2682 pages
ISBN:9781450349468
DOI:10.1145/3133956
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 October 2017

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. differential privacy
  2. private record linkage
  3. smc

Qualifiers

  • Research-article

Funding Sources

Conference

CCS '17
Sponsor:

Acceptance Rates

CCS '17 Paper Acceptance Rate 151 of 836 submissions, 18%;
Overall Acceptance Rate 1,261 of 6,999 submissions, 18%

Upcoming Conference

CCS '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)260
  • Downloads (Last 6 weeks)29
Reflects downloads up to 14 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Use Cases Requiring Privacy-Preserving Record Linkage in Paediatric OncologyCancers10.3390/cancers1615269616:15(2696)Online publication date: 29-Jul-2024
  • (2024)SecretFlow-SCQL: A Secure Collaborative Query PlatformProceedings of the VLDB Endowment10.14778/3685800.368582117:12(3987-4000)Online publication date: 1-Aug-2024
  • (2024)Scenario-based Adaptations of Differential Privacy: A Technical SurveyACM Computing Surveys10.1145/365115356:8(1-39)Online publication date: 26-Apr-2024
  • (2024)FedKNN: Secure Federated k-Nearest Neighbor SearchProceedings of the ACM on Management of Data10.1145/36392662:1(1-26)Online publication date: 26-Mar-2024
  • (2024)Efficient Privacy-Preserving Logistic Model With Malicious SecurityIEEE Transactions on Information Forensics and Security10.1109/TIFS.2024.340231919(5751-5766)Online publication date: 2024
  • (2024)CARGO: Crypto-Assisted Differentially Private Triangle Counting Without Trusted Servers2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00136(1671-1684)Online publication date: 13-May-2024
  • (2024)SARA: A Sparsity-Aware Efficient Oblivious Aggregation Service for Federated Matrix FactorizationWeb Information Systems Engineering – WISE 202410.1007/978-981-96-0567-5_17(227-242)Online publication date: 3-Dec-2024
  • (2023)Cryptographically Secure Private Record Linkage using Locality-Sensitive HashingProceedings of the VLDB Endowment10.14778/3626292.362629317:2(79-91)Online publication date: 1-Oct-2023
  • (2023)Falcon: A Privacy-Preserving and Interpretable Vertical Federated Learning SystemProceedings of the VLDB Endowment10.14778/3603581.360358816:10(2471-2484)Online publication date: 1-Jun-2023
  • (2023)Differentially Private Resource AllocationProceedings of the 39th Annual Computer Security Applications Conference10.1145/3627106.3627181(772-786)Online publication date: 4-Dec-2023
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media