DOI: 10.1145/3528535.3565238
research-article
Open access

Aergia: leveraging heterogeneity in federated learning systems

Published: 08 November 2022

Abstract

Federated Learning (FL) is a popular deep learning approach that avoids centralizing large amounts of data and instead relies on clients that update a global model using their local datasets. Classical FL algorithms use a central federator that, in each training round, waits for all clients to send their model updates before aggregating them. In practical deployments, clients may differ in computing power and network capability, so slow clients can become performance bottlenecks. Previous work has suggested imposing a deadline on each learning round, so that the federator ignores late updates from slow clients or so that clients send partially trained models before the deadline. To speed up training, we instead propose Aergia, a novel approach in which slow clients (i) freeze the part of their model that is the most computationally intensive to train; (ii) train the unfrozen part of their model; and (iii) offload the training of the frozen part to a faster client, which trains it using its own dataset. The federator orchestrates the offloading decisions based on the training speeds that clients report and on the similarities between their datasets, which are evaluated privately using a trusted execution environment. Extensive experiments show that Aergia maintains high accuracy and, in heterogeneous settings, reduces training time by up to 27% and 53% compared to FedAvg and TiFL, respectively.
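
To make the orchestration step more concrete, the following is a minimal sketch of how a federator could pair slow clients with fast ones from reported training times and dataset similarity. It is an illustration under stated assumptions, not the scheduler used by Aergia: the function names (`label_distance`, `plan_offloading`), the `slack` threshold, the greedy pairing rule, and the use of total variation distance between label distributions as a similarity proxy are all hypothetical. The trusted execution environment that keeps the similarity evaluation private is not modeled.

```python
from typing import Dict, List, Tuple


def label_distance(p: List[float], q: List[float]) -> float:
    """Total variation distance between two label distributions.

    Assumption: used here only as a stand-in for a dataset similarity measure.
    """
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def plan_offloading(
    round_time: Dict[str, float],
    label_dist: Dict[str, List[float]],
    slack: float = 1.0,
) -> List[Tuple[str, str]]:
    """Greedily pair each slow client with the most similar unpaired fast client.

    A client is treated as slow when its reported round time exceeds
    `slack` times the mean reported time (a hypothetical threshold).
    Returns (slow_client, fast_client) pairs, slowest client first.
    """
    mean_time = sum(round_time.values()) / len(round_time)
    slow = sorted(
        (c for c, t in round_time.items() if t > slack * mean_time),
        key=lambda c: -round_time[c],
    )
    fast = {c for c, t in round_time.items() if t <= slack * mean_time}

    pairs: List[Tuple[str, str]] = []
    for s in slow:
        if not fast:
            break  # more slow clients than fast ones: the rest stay unpaired
        # sorted() makes tie-breaking deterministic
        best = min(
            sorted(fast),
            key=lambda f: label_distance(label_dist[s], label_dist[f]),
        )
        pairs.append((s, best))
        fast.remove(best)
    return pairs


if __name__ == "__main__":
    # Hypothetical reports: seconds per round and per-class label frequencies.
    times = {"a": 10.0, "b": 2.0, "c": 3.0, "d": 9.0}
    dists = {"a": [0.9, 0.1], "b": [0.8, 0.2], "c": [0.1, 0.9], "d": [0.2, 0.8]}
    print(plan_offloading(times, dists))  # [('a', 'b'), ('d', 'c')]
```

In this toy input, clients `a` and `d` exceed the mean round time and are each matched to the fast client whose label distribution is closest to their own. A real deployment would also need to decide which layers to freeze and when to hand them over, which this sketch leaves out.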

Published In

Middleware '22: Proceedings of the 23rd ACM/IFIP International Middleware Conference
November 2022
110 pages
ISBN:9781450393409
DOI:10.1145/3528535
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. federated learning
  2. stragglers
  3. task offloading

Qualifiers

  • Research-article

Conference

Middleware '22: 23rd International Middleware Conference
November 7 - 11, 2022
QC, Quebec, Canada

Acceptance Rates

Middleware '22 Paper Acceptance Rate 8 of 21 submissions, 38%;
Overall Acceptance Rate 203 of 948 submissions, 21%

Article Metrics

  • Downloads (last 12 months): 233
  • Downloads (last 6 weeks): 27
Reflects downloads up to 12 Nov 2024

Cited By

  • (2024) Apodotiko: Enabling Efficient Serverless Federated Learning in Heterogeneous Environments. 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 206-215. DOI: 10.1109/CCGrid59990.2024.00032. Online publication date: 6-May-2024
  • (2024) A comprehensive survey of federated transfer learning: challenges, methods and applications. Frontiers of Computer Science 18:6. DOI: 10.1007/s11704-024-40065-x. Online publication date: 23-Jul-2024
  • (2023) FLIPS. Proceedings of the 24th International Middleware Conference, 301-315. DOI: 10.1145/3590140.3629123. Online publication date: 27-Nov-2023
  • (2023) Hwamei: A Learning-Based Synchronization Scheme for Hierarchical Federated Learning. 2023 IEEE 43rd International Conference on Distributed Computing Systems (ICDCS), 534-544. DOI: 10.1109/ICDCS57875.2023.00047. Online publication date: Jul-2023
  • (2023) FedCime: An Efficient Federated Learning Approach For Clients in Mobile Edge Computing. 2023 IEEE International Conference on Edge Computing and Communications (EDGE), 215-220. DOI: 10.1109/EDGE60047.2023.00042. Online publication date: Jul-2023
  • (2023) AdaptPSOFL: Adaptive Particle Swarm Optimization-Based Layer Offloading Framework for Federated Learning. Fourth International Conference on Image Processing and Capsule Networks, 597-610. DOI: 10.1007/978-981-99-7093-3_40. Online publication date: 18-Nov-2023
  • (2022) A Federated Transfer Learning Framework Based on Heterogeneous Domain Adaptation for Students’ Grades Classification. Applied Sciences 12:21 (10711). DOI: 10.3390/app122110711. Online publication date: 22-Oct-2022
