poster

A Reinforcement Learning Approach for Minimizing Job Completion Time in Clustered Federated Learning

Authors: Ruiting Zhou, Jieling Yu

ACM TURC '23: Proceedings of the ACM Turing Award Celebration Conference - China 2023

Pages 55 - 56

https://doi.org/10.1145/3603165.3607394

Published: 25 September 2023


Abstract

Federated Learning (FL) enables a potentially large number of clients to collaboratively train a global model, coordinated by a central cloud server, without exposing raw client data. However, FL model convergence performance, often measured by the job completion time, is hindered by two critical factors: non-independent and identically distributed (non-IID) data across clients and the straggler effect. In this work, we propose a clustered FL framework, MCFL, that minimizes the job completion time by mitigating the influence of non-IID data and the straggler effect while guaranteeing FL model convergence performance. MCFL operates in two stages: i) a clustering algorithm constructs clusters, each containing clients with similar computing and communication capabilities, to combat the straggler effect within a cluster; ii) a deep reinforcement learning (DRL) algorithm based on soft actor-critic with discrete actions selects a subset of clients from each cluster to mitigate the impact of non-IID data, and derives the number of intra-cluster aggregation iterations for each cluster to reduce the straggler effect among clusters. Extensive testbed experiments are conducted under various configurations to verify the efficacy of MCFL. The results show that MCFL can reduce the job completion time by up to compared with three state-of-the-art FL frameworks.
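
The idea behind the first stage, grouping clients with similar capabilities so that no cluster waits on a much slower member, can be illustrated with a minimal sketch. This is not the MCFL clustering algorithm, which the abstract does not specify; it uses plain Lloyd's k-means as a stand-in. The client timings, the two-feature representation (compute time, upload time), the centroid initialization, and the function names `kmeans` and `cluster_round_times` are all assumptions made for illustration.

```python
import math


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means (a stand-in, not the MCFL clustering algorithm)."""
    ordered = sorted(points)
    # Assumed deterministic init: evenly spaced picks from the sorted points.
    centroids = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each client to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                # Move the centroid to the mean of its members.
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels


def cluster_round_times(points, labels):
    """Per-cluster round time: the slowest member (compute + upload) sets the pace."""
    times = {}
    for (compute_s, upload_s), label in zip(points, labels):
        times[label] = max(times.get(label, 0.0), compute_s + upload_s)
    return times


# Hypothetical clients: (compute time per round in s, upload time per round in s).
clients = [(1.0, 0.5), (1.2, 0.6), (0.9, 0.4), (5.0, 3.0), (5.5, 2.8), (4.8, 3.2)]
labels = kmeans(clients, k=2)
times = cluster_round_times(clients, labels)
print(labels, times)
```

With these made-up timings, the three fast clients and the three slow clients end up in separate clusters, so the fast cluster's round time is set by its own slowest member and not by the slowest client overall.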


Information

Published In

ACM TURC '23: Proceedings of the ACM Turing Award Celebration Conference - China 2023

July 2023

173 pages

ISBN: 9798400702334

DOI: 10.1145/3603165

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster
Research
Refereed limited

Conference

ACM TURC '23

ACM TURC '23: ACM Turing Award Celebration Conference 2023

July 28 - 30, 2023

Wuhan, China
