DOI: 10.5555/3294996.3295196
Article
Free access

Federated multi-task learning

Published: 04 December 2017

Abstract

Federated learning poses new statistical and systems challenges in training machine learning models over distributed networks of devices. In this work, we show that multi-task learning is naturally suited to handle the statistical challenges of this setting, and propose a novel systems-aware optimization method, MOCHA, that is robust to practical systems issues. Our method and theory are the first to consider issues of high communication cost, stragglers, and fault tolerance for distributed multi-task learning. The resulting method achieves significant speedups compared to alternatives in the federated setting, as we demonstrate through simulations on real-world federated datasets.
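The setup the abstract describes can be illustrated with a minimal, self-contained simulation. This is a hedged sketch, not the paper's MOCHA algorithm: the function names (`local_update`, `federated_mtl_round`), the mean-shrinkage coupling parameter `lam`, and the toy scalar tasks are all invented here for illustration. Each "device" holds its own task's data and performs a variable amount of local work (a crude stand-in for straggler tolerance), while a server-side step couples the per-task models, a simple proxy for the task-relationship regularizer used in multi-task learning.

```python
import random

def local_update(w, data, lr=0.1, steps=5):
    """A few steps of least-squares SGD on one device's (x, y) pairs."""
    for _ in range(steps):
        x, y = random.choice(data)
        grad = 2 * (w * x - y) * x
        w -= lr * grad
    return w

def federated_mtl_round(models, device_data, lam=0.5):
    # Each device completes a different number of local steps per round;
    # partial work is still incorporated (straggler tolerance, loosely).
    for t, data in enumerate(device_data):
        steps = random.randint(1, 5)
        models[t] = local_update(models[t], data, steps=steps)
    # Server couples the tasks: shrink each model toward the task mean
    # (an illustrative stand-in for a multi-task relationship penalty).
    mean_w = sum(models) / len(models)
    return [(1 - lam) * w + lam * mean_w for w in models]

random.seed(0)
# Two related scalar tasks: y ≈ 2.0·x and y ≈ 2.2·x.
device_data = [[(x, 2.0 * x) for x in (1.0, 2.0, 3.0)],
               [(x, 2.2 * x) for x in (1.0, 2.0, 3.0)]]
models = [0.0, 0.0]
for _ in range(50):
    models = federated_mtl_round(models, device_data)
print(models)  # both models settle near the shared slope of roughly 2
```

The key design point mirrored here is that each task keeps its own model (unlike training a single global model), while the coupling term lets related tasks share statistical strength.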




Published In
NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems
December 2017
7104 pages

Publisher

Curran Associates Inc.

Red Hook, NY, United States

Qualifiers

  • Article

Article Metrics

  • Downloads (last 12 months): 123
  • Downloads (last 6 weeks): 16

Reflects downloads up to 12 Nov 2024

Cited By

  • (2023) A Survey on Collaborative Learning for Intelligent Autonomous Systems. ACM Computing Surveys, 56(4):1-37. DOI: 10.1145/3625544. Online publication date: 10-Nov-2023.
  • (2023) ReFRS: Resource-efficient Federated Recommender System for Dynamic and Diversified User Preferences. ACM Transactions on Information Systems, 41(3):1-30. DOI: 10.1145/3560486. Online publication date: 7-Feb-2023.
  • (2023) Federated Fuzzy Neural Network With Evolutionary Rule Learning. IEEE Transactions on Fuzzy Systems, 31(5):1653-1664. DOI: 10.1109/TFUZZ.2022.3207607. Online publication date: 1-May-2023.
  • (2021) Toward Responsible AI: An Overview of Federated Learning for User-centered Privacy-preserving Computing. ACM Transactions on Interactive Intelligent Systems, 11(3-4):1-22. DOI: 10.1145/3485875. Online publication date: 25-Oct-2021.
  • (2021) Joint Optimization in Edge-Cloud Continuum for Federated Unsupervised Person Re-identification. Proceedings of the 29th ACM International Conference on Multimedia, 433-441. DOI: 10.1145/3474085.3475182. Online publication date: 17-Oct-2021.
  • (2021) Desirable Companion for Vertical Federated Learning. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 2598-2607. DOI: 10.1145/3459637.3482249. Online publication date: 26-Oct-2021.
  • (2021) FedSkel. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 3283-3287. DOI: 10.1145/3459637.3482107. Online publication date: 26-Oct-2021.
  • (2021) Federated Learning in a Medical Context: A Systematic Literature Review. ACM Transactions on Internet Technology, 21(2):1-31. DOI: 10.1145/3412357. Online publication date: 2-Jun-2021.
  • (2020) Multiple Classification with Split Learning. The 9th International Conference on Smart Media and Applications, 358-363. DOI: 10.1145/3426020.3426131. Online publication date: 17-Sep-2020.
  • (2020) Federated Learning with Proximal Stochastic Variance Reduced Gradient Algorithms. Proceedings of the 49th International Conference on Parallel Processing, 1-11. DOI: 10.1145/3404397.3404457. Online publication date: 17-Aug-2020.
