Locally Differentially Private and Consistent Frequency Estimation of Longitudinal Data
Abstract
Local Differential Privacy (LDP) was developed as a Differential Privacy (DP) model that protects user data from the collector itself. However, applying LDP guarantees to tasks such as frequency estimation over time is challenging, because repeated collection consumes an ever-growing privacy budget and puts privacy and utility goals under pressure. Utility can be improved through post-processing techniques, although these may introduce unintended bias. In this paper, we analyze the performance of a range of longitudinal LDP protocols coupled with various post-processing techniques. We find Norm Sub and PowerNS to be the best-performing techniques and warn against the use of Norm Mul.
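To make the post-processing step concrete, the following is a minimal sketch of two of the techniques named above, assuming the usual definitions from the consistency literature (e.g. Wang et al., 2019): Norm Mul clips negative estimates to zero and rescales the rest multiplicatively, while Norm Sub adds a common shift to the estimates and clips at zero so that the result sums to one. The function names `norm_mul` and `norm_sub`, the `total` parameter, and the sample vector are illustrative assumptions, not the authors' implementation; PowerNS is not shown.

```python
from typing import List, Sequence


def norm_mul(estimates: Sequence[float], total: float = 1.0) -> List[float]:
    """Clip negatives to zero, then rescale so the result sums to `total`."""
    clipped = [max(v, 0.0) for v in estimates]
    mass = sum(clipped)
    if mass <= 0.0:
        # Assumption: this sketch refuses to guess when nothing is positive.
        raise ValueError("no positive estimate to rescale")
    return [v * total / mass for v in clipped]


def norm_sub(estimates: Sequence[float], total: float = 1.0) -> List[float]:
    """Find delta with sum(max(f + delta, 0)) == total and apply it."""
    if not estimates or total <= 0.0:
        raise ValueError("need a non-empty vector and a positive total")
    # Indices that are still allowed to keep a positive value.
    active = list(range(len(estimates)))
    while True:
        # Shift that makes the active entries sum exactly to `total`.
        delta = (total - sum(estimates[i] for i in active)) / len(active)
        kept = [i for i in active if estimates[i] + delta >= 0.0]
        if len(kept) == len(active):
            break  # no active entry went negative, so delta is final
        active = kept  # entries that went negative are fixed at zero
    result = [0.0] * len(estimates)
    for i in active:
        result[i] = estimates[i] + delta
    return result


if __name__ == "__main__":
    # Hypothetical noisy frequency estimates, as an LDP aggregator might output.
    noisy = [0.6, 0.5, -0.1, 0.2, -0.2]
    print("Norm Sub:", norm_sub(noisy))
    print("Norm Mul:", norm_mul(noisy))
```

On this sample vector, Norm Sub yields approximately [0.5, 0.4, 0, 0.1, 0] and Norm Mul approximately [0.46, 0.38, 0, 0.15, 0]. Both outputs are non-negative and sum to one; they differ in how the excess mass is removed, subtractively in one case and proportionally in the other.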
Keywords:
Local Differential Privacy, Frequency Estimation, Longitudinal Data, Experimental Analysis
References
H. H. Arcolezi, J.-F. Couchot, B. Al Bouna, and X. Xiao. Improving the utility of locally differentially private protocols for longitudinal and multidimensional frequency estimates. Digital Communications and Networks, 2022a.
H. H. Arcolezi, C. Pinzón, C. Palamidessi, and S. Gambs. Frequency estimation of evolving data under local differential privacy. arXiv preprint arXiv:2210.00262, 2022b.
B. Ding, J. Kulkarni, and S. Yekhanin. Collecting telemetry data privately. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, pages 3574–3583, Red Hook, NY, USA, 2017. Curran Associates Inc. ISBN 9781510860964.
C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography: Third Theory of Cryptography Conference, TCC 2006, New York, NY, USA, March 4-7, 2006. Proceedings 3, pages 265–284. Springer, 2006.
C. Dwork, A. Roth, et al. The algorithmic foundations of differential privacy. Foundations and Trends® in Theoretical Computer Science, 9(3–4):211–407, 2014.
Ú. Erlingsson, V. Pihur, and A. Korolova. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pages 1054–1067, 2014.
J. S. C. Filho and J. C. Machado. FELIP: A local differentially private approach to frequency estimation on multidimensional datasets. In International Conference on Extending Database Technology, 2023.
D. Hong, W. Jung, and K. Shim. Collecting geospatial data with local differential privacy for personalized services. In 2021 IEEE 37th International Conference on Data Engineering (ICDE), pages 2237–2242, 2021.
N. Johnson, J. P. Near, and D. Song. Towards practical differential privacy for SQL queries. Proceedings of the VLDB Endowment, 11(5), 2018. ISSN 2150-8097.
P. Kairouz, K. Bonawitz, and D. Ramage. Discrete distribution estimation under local privacy. In International Conference on Machine Learning, pages 2436–2444. PMLR, 2016.
G. Liu, P. Tang, C. Hu, C. Jin, and S. Guo. Multi-dimensional data publishing with local differential privacy. In Proceedings 26th International Conference on Extending Database Technology, EDBT 2023, Ioannina, Greece, March 28-31, 2023, pages 183–194. OpenProceedings.org, 2023.
X. Ren, L. Shi, W. Yu, S. Yang, C. Zhao, and Z. Xu. LDP-IDS: Local differential privacy for infinite data streams. In Proceedings of the 2022 International Conference on Management of Data, SIGMOD ’22, pages 1064–1077, New York, NY, USA, 2022. Association for Computing Machinery. ISBN 9781450392495.
Apple Differential Privacy Team. Learning with privacy at scale. 2017. URL [link].
T. Wang, J. Blocki, N. Li, and S. Jha. Locally differentially private protocols for frequency estimation. In 26th USENIX Security Symposium (USENIX Security 17), pages 729–745, 2017.
T. Wang, M. Lopuhaä-Zwakenberg, Z. Li, B. Skoric, and N. Li. Locally differentially private frequency estimation with consistency. arXiv preprint arXiv:1905.08320, 2019.
T. Wang, J. Q. Chen, Z. Zhang, D. Su, Y. Cheng, Z. Li, N. Li, and S. Jha. Continuous release of data streams under both centralized and local differential privacy. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, CCS ’21, New York, NY, USA, 2021. Association for Computing Machinery. ISBN 9781450384544.
S. L. Warner. Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association, 60(309):63–69, 1965.
Y. Zhu, Y. Cao, Q. Xue, Q. Wu, and Y. Zhang. Heavy hitter identification over large-domain set-valued data with local differential privacy. IEEE Transactions on Information Forensics and Security, 19:414–426, 2024. DOI: 10.1109/TIFS.2023.3324726.
Published
14 October 2024
How to Cite
MARREIRAS NETO, Antonio A.; DUARTE NETO, Eduardo R.; COSTA FILHO, José S.; MACHADO, Javam C. Locally Differentially Private and Consistent Frequency Estimation of Longitudinal Data. In: SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 39., 2024, Florianópolis/SC. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 367-380. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2024.240837.