DOI: 10.1145/2723372.2749453
research-article

Chiaroscuro: Transparency and Privacy for Massive Personal Time-Series Clustering

Published: 27 May 2015

Abstract

The advent of on-body/at-home sensors connected to personal devices leads to the generation of fine-grained, highly sensitive personal data at an unprecedented rate. However, despite the promises of large-scale analytics, obvious privacy concerns prevent individuals from sharing their personal data. In this paper, we propose Chiaroscuro, a complete solution for clustering personal data with strong privacy guarantees. The execution sequence produced by Chiaroscuro is massively distributed on personal devices and copes with arbitrary connections and disconnections. Chiaroscuro builds on our novel data structure, called Diptych, which allows the participating devices to collaborate privately by combining encryption with differential privacy. Our solution yields a high clustering quality while minimizing the impact of the differentially private perturbation. Chiaroscuro is both correct and secure. Finally, we provide an experimental validation of our approach on both real and synthetic sets of time-series.
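
To make the idea of a differentially private perturbation in clustering more concrete, the following is a minimal sketch of one k-means iteration in which per-cluster sums and counts receive Laplace noise before the centroids are recomputed. It is not the Chiaroscuro protocol: it runs in a single process, uses no encryption, and does not model Diptych or device churn. The function names (`laplace_noise`, `noisy_kmeans_step`), the clipping range `[0, bound]`, and the way the noise scale is derived are all assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_kmeans_step(series, centroids, epsilon, bound, rng):
    """One k-means iteration with Laplace-perturbed sums and counts.

    Assumption (not from the paper): every value is clipped to [0, bound],
    so adding or removing one series changes the sums and the count of a
    single cluster by at most dim * bound + 1 in L1 norm.
    """
    k = len(centroids)
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0.0] * k

    # Assignment step: attach each clipped series to its closest centroid.
    for s in series:
        clipped = [min(max(x, 0.0), bound) for x in s]
        closest = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(clipped, centroids[c])),
        )
        counts[closest] += 1.0
        for d in range(dim):
            sums[closest][d] += clipped[d]

    # Perturbation step: Laplace noise calibrated to the L1 sensitivity.
    scale = (dim * bound + 1.0) / epsilon
    new_centroids = []
    for j in range(k):
        noisy_count = counts[j] + laplace_noise(scale, rng)
        if noisy_count < 1.0:
            # Too little (noisy) support: keep the previous centroid.
            new_centroids.append(list(centroids[j]))
            continue
        new_centroids.append([
            min(max((sums[j][d] + laplace_noise(scale, rng)) / noisy_count, 0.0), bound)
            for d in range(dim)
        ])
    return new_centroids


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[1.0, 1.0]] * 50 + [[9.0, 9.0]] * 50
    print(noisy_kmeans_step(data, [[2.0, 2.0], [8.0, 8.0]], 1.0, 10.0, rng))
```

A smaller `epsilon` gives a larger noise scale and therefore noisier centroids, which illustrates the tension between privacy and clustering quality that the abstract refers to. Over several iterations the privacy budget would also have to be split across steps, which this sketch leaves out.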

Published In

SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
May 2015
2110 pages
ISBN:9781450327589
DOI:10.1145/2723372

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. clustering
  2. differential privacy
  3. gossip
  4. k-means
  5. secure multi-party computation
  6. sensors
  7. time-series

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'15: International Conference on Management of Data
May 31 - June 4, 2015
Melbourne, Victoria, Australia

Acceptance Rates

SIGMOD '15 Paper Acceptance Rate 106 of 415 submissions, 26%;
Overall Acceptance Rate 785 of 4,003 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 7
  • Downloads (Last 6 weeks): 1

Reflects downloads up to 13 Feb 2025.

Cited By

  • (2024) DPGazeSynth. Information Sciences: an International Journal, 675:C. DOI: 10.1016/j.ins.2024.120720. Online publication date: 1-Jul-2024
  • (2022) Edgelet Computing: Pushing Query Processing and Liability at the Extreme Edge of the Network. 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), pages 160-169. DOI: 10.1109/CCGrid54584.2022.00025. Online publication date: May-2022
  • (2020) PriRadar: A Privacy-Preserving Framework for Spatial Crowdsourcing. IEEE Transactions on Information Forensics and Security, 15, pages 299-314. DOI: 10.1109/TIFS.2019.2913232. Online publication date: 1-Jan-2020
  • (2020) Secure Distributed Queries over Large Sets of Personal Home Boxes. Transactions on Large-Scale Data- and Knowledge-Centered Systems XLIV, pages 108-131. DOI: 10.1007/978-3-662-62271-1_4. Online publication date: 10-Sep-2020
  • (2019) Trustworthy Distributed Computations on Personal Data Using Trusted Execution Environments. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE), pages 381-388. DOI: 10.1109/TrustCom/BigDataSE.2019.00058. Online publication date: Aug-2019
  • (2019) Robust Privacy-Preserving Gossip Averaging. Stabilization, Safety, and Security of Distributed Systems, pages 38-52. DOI: 10.1007/978-3-030-34992-9_4. Online publication date: 14-Nov-2019
  • (2019) Peer-to-Peer Data Management. Principles of Distributed Database Systems, pages 395-448. DOI: 10.1007/978-3-030-26253-2_9. Online publication date: 3-Dec-2019
  • (2019) Parallel Database Systems. Principles of Distributed Database Systems, pages 349-394. DOI: 10.1007/978-3-030-26253-2_8. Online publication date: 3-Dec-2019
  • (2019) Database Integration—Multidatabase Systems. Principles of Distributed Database Systems, pages 281-347. DOI: 10.1007/978-3-030-26253-2_7. Online publication date: 3-Dec-2019
  • (2019) Data Replication. Principles of Distributed Database Systems, pages 247-280. DOI: 10.1007/978-3-030-26253-2_6. Online publication date: 3-Dec-2019
