DOI: 10.1145/3600160.3605041
research-article
Open access

Securing Federated GANs: Enabling Synthetic Data Generation for Health Registry Consortiums

Published: 29 August 2023

Abstract

In this work, we review the architecture of existing federated Generative Adversarial Network (GAN) solutions and highlight their security and trust-related weaknesses. We then describe how these weaknesses make existing designs unsuitable for a consortium of health registries working towards generating synthetic data sets for research purposes. We propose a novel architecture that addresses these weaknesses by combining several building blocks to generate synthetic data in a decentralised setting: federated GANs, consortium blockchains, and the Shamir Secret Sharing algorithm. Finally, we discuss the advantages and disadvantages of our proposed solution and future research directions.
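
To make the Shamir Secret Sharing building block concrete, the following is a minimal Python sketch of the general (threshold, n) scheme. It is not the architecture proposed in the paper, and the abstract does not say which values are shared or how. The function names `split_secret` and `recover_secret`, the choice of the prime modulus, and the integer encoding of the secret are all assumptions made for illustration.

```python
import secrets

# Assumption: a Mersenne prime is used as the field modulus; the paper does not specify one.
PRIME = 2**127 - 1


def split_secret(secret: int, threshold: int, num_shares: int, prime: int = PRIME):
    """Split `secret` into `num_shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < prime:
        raise ValueError("secret must lie in [0, prime)")
    if not 1 <= threshold <= num_shares < prime:
        raise ValueError("need 1 <= threshold <= num_shares < prime")
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, highest-degree coefficient first
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def recover_secret(shares, prime: int = PRIME) -> int:
    """Lagrange-interpolate the polynomial at x = 0 from the given shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # pow(den, -1, prime) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret
```

For example, `split_secret(123456789, 3, 5)` yields five shares, and passing any three of them to `recover_secret` returns the original value. In a hypothetical consortium, each registry could hold one share, so that no single party can reconstruct the value alone.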

Cited By

  • (2024) Leveraging Artificial Intelligence Models Using Synthetic Data. Proceedings of the Future Technologies Conference (FTC) 2024, Volume 1, 56-66. DOI: 10.1007/978-3-031-73110-5_4. Online publication date: 5-Nov-2024
  • (2023) Federated Learning Showdown: The Comparative Analysis of Federated Learning Frameworks. 2023 Eighth International Conference on Fog and Mobile Edge Computing (FMEC), 224-231. DOI: 10.1109/FMEC59375.2023.10305961. Online publication date: 18-Sep-2023

Information & Contributors

Information

Published In

ARES '23: Proceedings of the 18th International Conference on Availability, Reliability and Security
August 2023
1440 pages
ISBN: 9798400707728
DOI: 10.1145/3600160
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Consortium blockchain
  2. Federated learning
  3. GAN
  4. Health Registry
  5. Shamir Secret Sharing

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ARES 2023

Acceptance Rates

Overall Acceptance Rate 228 of 451 submissions, 51%

Article Metrics

  • Downloads (Last 12 months): 214
  • Downloads (Last 6 weeks): 32
Reflects downloads up to 14 Dec 2024
