DOI: 10.1145/3581783.3611984
Research article · Open access

Federated Learning with Label-Masking Distillation

Published: 27 October 2023

Abstract

Federated learning provides a privacy-preserving way to collaboratively train models on data distributed over multiple local clients under the coordination of a global server. In this paper, we focus on label distribution skew in federated learning, where, because user behavior differs from client to client, the label distributions of different clients differ significantly. Most existing methods handle such cases suboptimally because they make inadequate use of the label distribution information available on each client. Motivated by this, we propose a label-masking distillation approach, termed FedLMD, that facilitates federated learning by perceiving the label distribution of each client. During training, we classify labels into majority and minority labels based on the number of examples per class. The client model learns the knowledge of majority labels from its local data, while the distillation process masks out the global model's predictions on majority labels so that it can focus on preserving the client's minority-label knowledge. A series of experiments shows that the proposed approach achieves state-of-the-art performance in various settings. Moreover, considering the limited resources of clients, we propose a variant, FedLMD-Tf, that does not require an additional teacher and outperforms previous lightweight approaches without increasing computational cost. Our code is available at https://github.com/wnma3mz/FedLMD.
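
To make this concrete, the following is a minimal sketch, in PyTorch, of what a label-masking distillation loss of this kind could look like on a single client. It is assembled from the abstract alone and is not the authors' implementation (see the linked repository for that); the function names (majority_label_mask, label_masking_distillation_loss), the count threshold, and the temperature and weighting parameters are illustrative assumptions.

    # Minimal sketch of a label-masking distillation loss, based only on the
    # description in the abstract; names, the majority/minority split rule, and
    # the masking details are assumptions, not the reference implementation.
    import torch
    import torch.nn.functional as F

    def majority_label_mask(label_counts: torch.Tensor, threshold: int = 0) -> torch.Tensor:
        # label_counts[c] = number of local examples of class c on this client.
        # Classes with more than `threshold` examples are treated as majority
        # labels; the exact split criterion is an assumption.
        return label_counts > threshold

    def label_masking_distillation_loss(student_logits, teacher_logits, targets,
                                        majority_mask, temperature=2.0, alpha=1.0):
        # Supervised term: the client model learns majority labels from local data.
        ce = F.cross_entropy(student_logits, targets)

        # Mask out majority labels in both models' logits before distillation, so
        # the KL term only transfers the global (teacher) model's knowledge about
        # the client's minority labels. Assumes at least one minority label exists.
        neg_inf = torch.finfo(student_logits.dtype).min
        masked_student = student_logits.masked_fill(majority_mask, neg_inf)
        masked_teacher = teacher_logits.masked_fill(majority_mask, neg_inf)
        kd = F.kl_div(
            F.log_softmax(masked_student / temperature, dim=1),
            F.softmax(masked_teacher / temperature, dim=1),
            reduction="batchmean",
        ) * temperature ** 2

        return ce + alpha * kd

    # Hypothetical usage inside a local training step:
    #   mask = majority_label_mask(local_label_counts)            # shape [num_classes]
    #   loss = label_masking_distillation_loss(
    #       client_model(x), global_model(x).detach(), y, mask)

The abstract also mentions a teacher-free variant, FedLMD-Tf, that avoids the extra teacher forward pass; since the abstract does not say how it replaces the teacher's predictions, the sketch above covers only the teacher-based formulation.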

Supplemental Material

MP4 File: Presentation video for "Federated Learning with Label-Masking Distillation"

    Published In

    MM '23: Proceedings of the 31st ACM International Conference on Multimedia
    October 2023
    9913 pages
    ISBN: 9798400701085
    DOI: 10.1145/3581783
    This work is licensed under a Creative Commons Attribution 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. federated learning
    2. knowledge distillation

    Funding Sources

    • The National Key Research and Development Plan

    Conference

    MM '23: The 31st ACM International Conference on Multimedia
    October 29 - November 3, 2023
    Ottawa, ON, Canada

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Cited By

    • (2025) CyclicFL: Efficient Federated Learning with Cyclic Model Pre-Training. Journal of Circuits, Systems and Computers. DOI: 10.1142/S0218126625501658. Online publication date: 5-Feb-2025.
    • (2024) Knowledge distillation in federated learning. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 8188-8196. DOI: 10.24963/ijcai.2024/905. Online publication date: 3-Aug-2024.
    • (2024) Federated Morozov Regularization for Shortcut Learning in Privacy Preserving Learning with Watermarked Image Data. Proceedings of the 32nd ACM International Conference on Multimedia, 4899-4908. DOI: 10.1145/3664647.3681480. Online publication date: 28-Oct-2024.
    • (2024) One-shot-but-not-degraded Federated Learning. Proceedings of the 32nd ACM International Conference on Multimedia, 11070-11079. DOI: 10.1145/3664647.3680715. Online publication date: 28-Oct-2024.
    • (2024) GANFAT: Robust federated adversarial learning with label distribution skew. Future Generation Computer Systems, Vol. 160, 711-723. DOI: 10.1016/j.future.2024.06.030. Online publication date: Nov-2024.
