DOI: 10.5555/3294996.3295196
Article
Free access

Federated multi-task learning

Published: 04 December 2017

Abstract

Federated learning poses new statistical and systems challenges in training machine learning models over distributed networks of devices. In this work, we show that multi-task learning is naturally suited to handle the statistical challenges of this setting, and propose a novel systems-aware optimization method, MOCHA, that is robust to practical systems issues. Our method and theory are the first to consider issues of high communication cost, stragglers, and fault tolerance for distributed multi-task learning. The resulting method achieves significant speedups compared to alternatives in the federated setting, as we demonstrate through simulations on real-world federated datasets.
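The setup the abstract describes can be illustrated with a minimal, self-contained simulation. This is a hedged sketch, not the paper's MOCHA algorithm: the function names (`local_update`, `federated_mtl_round`), the mean-shrinkage coupling parameter `lam`, and the toy scalar tasks are all invented here for illustration. Each "device" holds its own task's data and performs a variable amount of local work (a crude stand-in for straggler tolerance), while a server-side step couples the per-task models, a simple proxy for the task-relationship regularizer used in multi-task learning.

```python
import random

def local_update(w, data, lr=0.1, steps=5):
    """A few steps of least-squares SGD on one device's (x, y) pairs."""
    for _ in range(steps):
        x, y = random.choice(data)
        grad = 2 * (w * x - y) * x
        w -= lr * grad
    return w

def federated_mtl_round(models, device_data, lam=0.5):
    # Each device completes a different number of local steps per round;
    # partial work is still incorporated (straggler tolerance, loosely).
    for t, data in enumerate(device_data):
        steps = random.randint(1, 5)
        models[t] = local_update(models[t], data, steps=steps)
    # Server couples the tasks: shrink each model toward the task mean
    # (an illustrative stand-in for a multi-task relationship penalty).
    mean_w = sum(models) / len(models)
    return [(1 - lam) * w + lam * mean_w for w in models]

random.seed(0)
# Two related scalar tasks: y ≈ 2.0·x and y ≈ 2.2·x.
device_data = [[(x, 2.0 * x) for x in (1.0, 2.0, 3.0)],
               [(x, 2.2 * x) for x in (1.0, 2.0, 3.0)]]
models = [0.0, 0.0]
for _ in range(50):
    models = federated_mtl_round(models, device_data)
print(models)  # both models settle near the shared slope of roughly 2
```

The key design point mirrored here is that each task keeps its own model (unlike training a single global model), while the coupling term lets related tasks share statistical strength.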




Published In
NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems
December 2017
7104 pages

Publisher

Curran Associates Inc.

Red Hook, NY, United States

Qualifiers

  • Article

Article Metrics

  • Downloads (last 12 months): 123
  • Downloads (last 6 weeks): 16

Reflects downloads up to 12 Nov 2024

Cited By

  • (2023) A Survey on Collaborative Learning for Intelligent Autonomous Systems. ACM Computing Surveys, 56(4):1-37. DOI: 10.1145/3625544. Online publication date: 10-Nov-2023.
  • (2023) ReFRS: Resource-efficient Federated Recommender System for Dynamic and Diversified User Preferences. ACM Transactions on Information Systems, 41(3):1-30. DOI: 10.1145/3560486. Online publication date: 7-Feb-2023.
  • (2023) Federated Fuzzy Neural Network With Evolutionary Rule Learning. IEEE Transactions on Fuzzy Systems, 31(5):1653-1664. DOI: 10.1109/TFUZZ.2022.3207607. Online publication date: 1-May-2023.
  • (2021) Toward Responsible AI: An Overview of Federated Learning for User-centered Privacy-preserving Computing. ACM Transactions on Interactive Intelligent Systems, 11(3-4):1-22. DOI: 10.1145/3485875. Online publication date: 25-Oct-2021.
  • (2021) Joint Optimization in Edge-Cloud Continuum for Federated Unsupervised Person Re-identification. Proceedings of the 29th ACM International Conference on Multimedia, 433-441. DOI: 10.1145/3474085.3475182. Online publication date: 17-Oct-2021.
  • (2021) Desirable Companion for Vertical Federated Learning. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 2598-2607. DOI: 10.1145/3459637.3482249. Online publication date: 26-Oct-2021.
  • (2021) FedSkel. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 3283-3287. DOI: 10.1145/3459637.3482107. Online publication date: 26-Oct-2021.
  • (2021) Federated Learning in a Medical Context: A Systematic Literature Review. ACM Transactions on Internet Technology, 21(2):1-31. DOI: 10.1145/3412357. Online publication date: 2-Jun-2021.
  • (2020) Multiple Classification with Split Learning. The 9th International Conference on Smart Media and Applications, 358-363. DOI: 10.1145/3426020.3426131. Online publication date: 17-Sep-2020.
  • (2020) Federated Learning with Proximal Stochastic Variance Reduced Gradient Algorithms. Proceedings of the 49th International Conference on Parallel Processing, 1-11. DOI: 10.1145/3404397.3404457. Online publication date: 17-Aug-2020.
