DOI: 10.1145/3423211.3425685

FLeet: Online Federated Learning via Staleness Awareness and Performance Prediction

Published: 11 December 2020

Abstract

Federated Learning (FL) is very appealing for its privacy benefits: essentially, a global model is trained with updates computed on mobile devices while keeping the data of users local. Standard FL infrastructures are however designed to have no energy or performance impact on mobile devices, and are therefore not suitable for applications that require frequent (online) model updates, such as news recommenders.
This paper presents FLeet, the first Online FL system, acting as a middleware between the Android OS and the machine learning application. FLeet combines the privacy of Standard FL with the precision of online learning thanks to two core components: (i) I-Prof, a new lightweight profiler that predicts and controls the impact of learning tasks on mobile devices, and (ii) AdaSGD, a new adaptive learning algorithm that is resilient to delayed updates.
Our extensive evaluation shows that Online FL, as implemented by FLeet, can deliver a 2.3× quality boost compared to Standard FL, while only consuming 0.036% of the battery per day. I-Prof can accurately control the impact of learning tasks by improving the prediction accuracy up to 3.6× (computation time) and up to 19× (energy). AdaSGD outperforms alternative FL approaches by 18.4% in terms of convergence speed on heterogeneous data.
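The core idea behind a staleness-aware update rule like AdaSGD can be illustrated with a small sketch. This is not the paper's exact formulation: the exponential dampening factor, its decay rate `lam`, and all function names here are illustrative assumptions; AdaSGD's actual weighting is defined in the paper itself.

```python
import numpy as np

def staleness_weight(tau, lam=0.5):
    """Dampening factor for a gradient computed tau steps ago.
    Exponential decay is one common choice in the staleness-aware
    SGD literature; it shrinks toward 0 as staleness grows."""
    return np.exp(-lam * tau)

def apply_update(model, grad, lr, current_step, grad_step):
    """Apply a (possibly stale) gradient, scaled by its staleness."""
    tau = current_step - grad_step  # how many steps old this gradient is
    return model - lr * staleness_weight(tau) * grad

model = np.zeros(3)
grad = np.array([1.0, 1.0, 1.0])
# A fresh gradient (staleness 0) is applied at full strength...
fresh = apply_update(model, grad, lr=0.1, current_step=10, grad_step=10)
# ...while a gradient computed 5 steps ago is strongly dampened.
stale = apply_update(model, grad, lr=0.1, current_step=10, grad_step=5)
```

The stale update moves the model strictly less than the fresh one, which is what makes asynchronous aggregation resilient to delayed mobile contributors.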



Index Terms

  1. FLeet: Online Federated Learning via Staleness Awareness and Performance Prediction

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

Published In

Middleware '20: Proceedings of the 21st International Middleware Conference
December 2020, 455 pages
ISBN: 9781450381536
DOI: 10.1145/3423211


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. asynchronous gradient descent
      2. federated learning
      3. mobile Android devices
      4. online learning
      5. profiling

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

Middleware '20: 21st International Middleware Conference
December 7-11, 2020
Delft, Netherlands

      Acceptance Rates

      Overall Acceptance Rate 203 of 948 submissions, 21%


Cited By

• (2024) Dynamic Client Clustering, Bandwidth Allocation, and Workload Optimization for Semi-Synchronous Federated Learning. Electronics 13(23), 4585. DOI: 10.3390/electronics13234585
• (2024) Location Leakage in Federated Signal Maps. IEEE Transactions on Mobile Computing 23(6), 6936-6953. DOI: 10.1109/TMC.2023.3332034
• (2023) Latency-Aware Semi-Synchronous Client Selection and Model Aggregation for Wireless Federated Learning. Future Internet 15(11), 352. DOI: 10.3390/fi15110352
• (2023) Falkor: Federated Learning Secure Aggregation Powered by AES-CTR GPU Implementation. Proceedings of the 11th Workshop on Encrypted Computing & Applied Homomorphic Cryptography, 11-22. DOI: 10.1145/3605759.3625261
• (2023) Scheduling Algorithms for Federated Learning With Minimal Energy Consumption. IEEE Transactions on Parallel and Distributed Systems 34(4), 1215-1226. DOI: 10.1109/TPDS.2023.3240833
• (2023) HiFlash: Communication-Efficient Hierarchical Federated Learning With Adaptive Staleness Control and Heterogeneity-Aware Client-Edge Association. IEEE Transactions on Parallel and Distributed Systems 34(5), 1560-1579. DOI: 10.1109/TPDS.2023.3238049
• (2023) SpongeTraining: Achieving High Efficiency and Accuracy for Wireless Edge-Assisted Online Distributed Learning. IEEE Transactions on Mobile Computing 22(8), 4930-4945. DOI: 10.1109/TMC.2022.3154644
• (2023) Graph Federated Learning for CIoT Devices in Smart Home Applications. IEEE Internet of Things Journal 10(8), 7062-7079. DOI: 10.1109/JIOT.2022.3228727
• (2023) Heterogeneous Federated Learning for Balancing Job Completion Time and Model Accuracy. 2022 IEEE 28th International Conference on Parallel and Distributed Systems (ICPADS), 562-569. DOI: 10.1109/ICPADS56603.2022.00079
• (2023) Dynamic Scheduling For Federated Edge Learning With Streaming Data. 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), 1-5. DOI: 10.1109/ICASSPW59220.2023.10193322
