DOI: 10.1145/3369583.3392686
Research article

TiFL: A Tier-based Federated Learning System

Published: 23 June 2020

Abstract

Federated Learning (FL) enables learning a shared model across many clients without violating their privacy requirements. A key attribute of FL is heterogeneity, which exists in both resources and data due to differences in computation and communication capacity, as well as in the quantity and content of data, across clients. We conduct a case study showing that heterogeneity in resources and data has a significant impact on training time and model accuracy in conventional FL systems. To this end, we propose TiFL, a Tier-based Federated Learning system, which divides clients into tiers based on their training performance and selects clients from the same tier in each training round, mitigating the straggler problem caused by heterogeneity in resources and data quantity. To further tame the heterogeneity caused by non-IID (non-independent and identically distributed) data and resources, TiFL employs an adaptive tier selection approach that updates the tiering on the fly based on observed training performance and accuracy. We prototype TiFL in an FL testbed following Google's FL architecture and evaluate it using state-of-the-art FL benchmarks. Experimental evaluation shows that TiFL outperforms conventional FL under various heterogeneous conditions. With the proposed adaptive tier selection policy, TiFL achieves much faster training while delivering the same or better test accuracy across the board.
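The tiering-and-selection idea in the abstract can be made concrete with a small sketch: profile clients by training latency, bucket them into tiers, then each round pick one tier and sample clients from it. This is a minimal illustration under stated assumptions, not TiFL's actual algorithm; the function names, the per-tier credit budget, and the inverse-accuracy tier weighting are assumptions of this sketch.

```python
import random
from collections import defaultdict

def assign_tiers(latencies, num_tiers=3):
    """Bucket clients into tiers by profiled per-round training latency.

    latencies: dict mapping client id -> measured training time (seconds).
    Returns a dict mapping tier index (0 = fastest) -> list of client ids.
    """
    ranked = sorted(latencies, key=latencies.get)  # fastest clients first
    size = -(-len(ranked) // num_tiers)            # ceil division: tier size
    tiers = defaultdict(list)
    for i, client in enumerate(ranked):
        tiers[i // size].append(client)
    return dict(tiers)

def select_clients(tiers, credits, accuracy, clients_per_round=2):
    """Pick one tier, then sample clients from only that tier.

    credits: remaining selection budget per tier (an assumed mechanism that
        keeps fast tiers from being selected every round).
    accuracy: last observed per-tier test accuracy; tiers with lower accuracy
        are weighted higher so under-trained tiers still contribute.
    """
    eligible = [t for t in tiers if credits[t] > 0]
    if not eligible:
        return None, []
    # Adaptive step: weight tiers inversely to their observed accuracy.
    weights = [1.0 - accuracy[t] for t in eligible]
    tier = random.choices(eligible, weights=weights, k=1)[0]
    credits[tier] -= 1
    chosen = random.sample(tiers[tier], min(clients_per_round, len(tiers[tier])))
    return tier, chosen
```

Because every selected client in a round comes from the same latency tier, the round finishes roughly at that tier's speed instead of waiting on the slowest straggler in a mixed selection.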

Supplementary Material

MP4 File (3369583.3392686.mp4)
In this presentation, we demonstrate the effects that data and resource heterogeneity have on federated learning systems and, based on those observations, propose a system that reduces the training time of the federated learning process without compromising model accuracy by using an adaptive scheduling policy. This work was done in collaboration with George Mason University; the University of Nevada, Reno; and IBM Research Almaden.




Published In

HPDC '20: Proceedings of the 29th International Symposium on High-Performance Parallel and Distributed Computing
June 2020
246 pages
ISBN:9781450370523
DOI:10.1145/3369583
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data heterogeneity
  2. edge computing
  3. federated learning
  4. non-IID
  5. resource heterogeneity
  6. stragglers

Qualifiers

  • Research-article

Conference

HPDC '20

Acceptance Rates

Overall Acceptance Rate 166 of 966 submissions, 17%

Article Metrics

  • Downloads (Last 12 months)472
  • Downloads (Last 6 weeks)36
Reflects downloads up to 18 Nov 2024


Cited By

  • (2025) FedCust: Offloading hyperparameter customization for federated learning. Performance Evaluation 167 (102450). DOI: 10.1016/j.peva.2024.102450. Online publication date: Mar-2025
  • (2024) An Efficient Attribute-Based Participant Selecting Scheme with Blockchain for Federated Learning in Smart Cities. Computers 13:5 (118). DOI: 10.3390/computers13050118. Online publication date: 9-May-2024
  • (2024) QuoTa: An Online Quality-Aware Incentive Mechanism for Fast Federated Learning. Applied Sciences 14:2 (833). DOI: 10.3390/app14020833. Online publication date: 18-Jan-2024
  • (2024) FedNIC: enhancing privacy-preserving federated learning via homomorphic encryption offload on SmartNIC. Frontiers in Computer Science 6. DOI: 10.3389/fcomp.2024.1465352. Online publication date: 21-Oct-2024
  • (2024) FedCaSe: Enhancing Federated Learning with Heterogeneity-aware Caching and Scheduling. Proceedings of the 2024 ACM Symposium on Cloud Computing, 52-68. DOI: 10.1145/3698038.3698559. Online publication date: 20-Nov-2024
  • (2024) Topology-aware Federated Learning in Edge Computing: A Comprehensive Survey. ACM Computing Surveys 56:10, 1-41. DOI: 10.1145/3659205. Online publication date: 22-Jun-2024
  • (2024) Ensuring Fairness and Gradient Privacy in Personalized Heterogeneous Federated Learning. ACM Transactions on Intelligent Systems and Technology 15:3, 1-30. DOI: 10.1145/3652613. Online publication date: 13-Mar-2024
  • (2024) Privacy-Preserving and Cross-Domain Human Sensing by Federated Domain Adaptation with Semantic Knowledge Correction. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8:1, 1-26. DOI: 10.1145/3643503. Online publication date: 6-Mar-2024
  • (2024) Behave Differently when Clustering: A Semi-asynchronous Federated Learning Approach for IoT. ACM Transactions on Sensor Networks 20:3, 1-28. DOI: 10.1145/3639825. Online publication date: 25-Jan-2024
  • (2024) Totoro: A Scalable Federated Learning Engine for the Edge. Proceedings of the Nineteenth European Conference on Computer Systems, 182-199. DOI: 10.1145/3627703.3629575. Online publication date: 22-Apr-2024
