Abstract
Wireless trace data play an important role in wireless network research. However, publishing raw WLAN traces poses potential privacy risks to network users. It is therefore necessary to sanitize users’ sensitive information before the traces are published, while preserving high data utility for wireless network research. Although some existing works based on various anonymization methods have begun to address the problem of sanitizing WLAN traces, our analysis of a real WLAN trace dataset shows that these anonymization techniques cannot provide a strong and provable privacy guarantee. Differential privacy is the only framework that can provide a strong and provable privacy guarantee. However, our analysis shows that existing studies on differential privacy fail to provide effective data utility for query operations on multi-dimensional, large-scale datasets. Targeting WLAN trace datasets, which are characteristically multi-dimensional and large-scale, this paper proposes a privacy-preserving data publishing algorithm that not only satisfies differential privacy but also achieves high data utility for query operations. We prove that the proposed sanitization algorithm satisfies \(\epsilon \)-differential privacy. Furthermore, theoretical analysis shows that the noise variance of our sanitization algorithm is \(O(\log ^{o(1)}n/\epsilon ^2)\), which indicates that the algorithm can achieve high data utility on large-scale datasets. Moreover, extensive experiments on an enterprise-scale WLAN trace dataset show that our sanitization algorithm provides high data utility for query operations.
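To make the \(1/\epsilon ^2\) dependence of the noise variance concrete, the following is a minimal sketch of the standard Laplace mechanism applied to a single count query. It is a textbook baseline and not the sanitization algorithm proposed in this paper. The function names (`laplace_noise`, `noisy_count`, `laplace_variance`) and the assumption that one user changes a count by at most 1 (sensitivity 1) are illustrative choices, not taken from the article.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count under epsilon-differential privacy.

    Assumption (illustrative): adding or removing one user changes the
    count by at most 1, so the query sensitivity is 1 and the Laplace
    scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(1.0 / epsilon, rng)


def laplace_variance(epsilon: float) -> float:
    """Variance of Laplace noise with scale 1 / epsilon, i.e. 2 / epsilon^2."""
    return 2.0 / (epsilon ** 2)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    for eps in (0.1, 0.5, 1.0):
        released = noisy_count(1000, eps, rng)
        print(f"epsilon={eps}: released={released:.2f}, "
              f"noise variance={laplace_variance(eps):.1f}")
```

In this baseline a smaller \(\epsilon \) gives stronger privacy but a larger variance of \(2/\epsilon ^2\) per released count. The sketch covers only a single query and says nothing about how the error grows with the dataset size \(n\) or the number of dimensions.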
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants performed by any of the authors.
Additional information
Communicated by V. Loia.
Cite this article
Li, D., Dong, X. & Li, M. Differential privacy for publishing enterprise-scale WLAN traces. Soft Comput 23, 5667–5682 (2019). https://doi.org/10.1007/s00500-018-3223-9
DOI: https://doi.org/10.1007/s00500-018-3223-9