
1 Introduction

1.1 Background

The widespread use of smart devices such as smartphones, tablets, and Internet of Things (IoT) devices of various sizes and purposes is driving the progress of services in smart environments, including smart cities, intelligent transport systems, and infrastructure [1,2,3]. Furthermore, the massive number of edge devices is expected to generate extensive data that require processing and analysis through automated methods. Machine learning can fuel the emergence of novel applications in smart environments by leveraging these data [4]. Smart cities and their associated services, such as intelligent traffic management, waste management, surveillance, and infrastructure monitoring, are examples of such environments.

The exponential growth of data generated by IoT and mobile devices, along with demands for low-latency computation, privacy, and scalability, drives the shift to edge computing. This approach enhances model training by placing computation nearer the data source, hence reducing data transmission latency. However, edge devices (i.e., edge nodes) often have limited computation power, storage, and energy capacity, making it challenging to run computationally intensive applications, mainly when a large amount of data must be processed [5].

Fig. 1. A representation of subMFL working with a dense global model (GM) and generated submodels (\(SM = [sm_1, sm_2, \ldots, sm_9]\)).

Distributed machine learning refers to multi-node machine learning algorithms and systems that are designed to improve performance, increase accuracy, and scale to larger input data sizes [6]. Powerful parallel and distributed computing systems have recently become widely accessible through multi-core processors and cloud computing platforms, and are applicable to problems traditionally addressed by centralised and sequential approaches [7]. Standard distributed learning involves training Deep Neural Networks (DNNs) on cloud servers and deploying them to edge nodes. However, this does not perform well for applications needing low latency, privacy, and scalability. Centralised model training also demands data sharing, which may discourage data owners from granting access to their data for model training.

In such a setting, machine learning models must be trained either at the same nodes that generate the data (also referred to as agents, clients, workers, or devices) or at a set of intermediate nodes, each collecting a subset of the data. Federated Learning (FL) [8] enables distributed machine learning across a large number of devices without requiring them to share their data with a central server. Once a device trains its local model on its local data, the local model parameters are returned to the central server, aggregated with the other sub-models, and the updated global model is distributed back to all devices. A key challenge in deploying FL is the vast heterogeneity of devices [9], ranging from low-end IoT devices (e.g., humidity sensors) to mobile devices, as shown in Fig. 1, each having access to various types and amounts of data and hardware.

Fig. 2. Standard Federated Learning system's initial global models' performance.

The widely accepted approach in FL requires all devices to use the same global model. However, this causes a problem when large-scale models, such as DNNs with a large number of parameters, must be used by resource-constrained devices. To use dense DNNs in FL systems, developers frequently choose to exclude such devices from training, which introduces training bias and harms model generality, since the data owned by those devices is excluded [10]. Another approach is to reduce the global model's size in depth or width to accommodate resource-constrained devices; however, this results in lower accuracy due to model capacity constraints [11]. Figure 2 illustrates the density range of possible initial global models, obtained by varying the width (dropping neurons or links), all with randomly generated weights, in Standard Federated Learning (S-FL). Arbitrarily picking one of these models as the global model to share with all devices leads to a tradeoff between participation rate and model learning constraints, driven by the size of the selected model. While randomised model selection leads to this issue, DNN pruning has the potential to generate sparse yet suitable models. In FL, this compression technique has been proposed to generate purposefully sparse models addressing challenges such as communication overhead [12], data heterogeneity [13], and the inclusion of heterogeneous devices [14].

1.2 Related Works

Looking at the literature in more detail, FedSCR [12] reduces upstream communication by clustering parameter update patterns and exploiting sparsity through structure-based pruning. Hermes [15] uses structured pruning to find small subnetworks on each device, and only updates from these subnetworks are communicated, improving communication and inference efficiency. AdaptCL [16] sends models pruned to different degrees to each device in order to synchronise the FL process in a heterogeneous environment and make device update response times converge.

To address data heterogeneity, some methods cluster devices based on their parameters and aggregate each cluster's parameters separately [13]. In [17], a custom pruning method was introduced that maximises the coverage index using a local pruning mask, considering both pruning-induced errors and the minimum coverage index instead of solely preserving the largest parameters.

Considering pruning in terms of device heterogeneity, in PruneFL [18] an initial device is first selected to prune the initial global model, and further pruning then involves both the server and the devices during the FL process, so that a good starting point can be found for an FL process involving all devices. In [19], local dataset-aware pruning is run before the FL process in order to obtain device-specific models. In [20], the server determines a pruning ratio and allocates wireless resources adaptively during FL training; a threshold-based device selection strategy is then used to further improve learning performance. FedPrune [21] sends different randomly generated sub-models from the server to the devices in order to find the optimal sub-model. FedMP [22] enables each device to avoid training the entire global model by determining device-specific pruning ratios. FjORD [14] takes system diversity into account and, instead of sharing the same model size with all devices, tailors the model size to each device.

1.3 Motivations and Contributions

Several works have investigated the implementation of DNN pruning in FL. However, existing studies that address device heterogeneity either require a specific pruning ratio to be determined or leave the computational cost of the pruning process to the device side. This imposes extra energy costs on resource-constrained devices, which must repeatedly prune until they find the optimal trainable model architecture.

In this paper, we focus on enabling heterogeneous devices to participate in the FL process by proposing Compatible subModel Generation in FL (subMFL). Instead of the randomly generated smaller models of S-FL, subMFL aims to produce submodels that, despite their smaller size, already have a suitable initial accuracy. In this model, an initial dense Global Model (GM) with low learning constraints is shared with all devices and trained by the devices with enough resources. When training is completed, the model is pruned without data to generate compatible submodels for resource-constrained devices, without prior knowledge of their computation and communication capabilities and without determining a specific pruning ratio. For evaluation purposes, different threshold values are used to generate submodels with different sparsity levels, and the accuracy of the models and the level of device participation for each set are reported. Our main contributions in this paper are as follows:

  • Server-side model pruning: Over-pruning on the device side leads to extra energy loss. In our work, the pruning stage is carried out entirely on the server, without the need for any data sample.

  • Compatible subModel generation: By ensuring that the trained dense model's weights are transferred to the generated submodels, resource-constrained devices benefit from the data used to train GM by starting from a model with satisfactory accuracy.

  • Increased heterogeneous device participation rate: subMFL tailors the FL paradigm to environments that include heterogeneous devices with various levels of computational resources by assigning suitable pre-trained and compressed initial global models that fit their resources (Fig. 1).

2 Compatible SubModel Generation in Federated Learning

In this paper, we propose a Compatible subModel Generation model to enhance Federated Learning (subMFL) in environments with heterogeneous, resource-constrained devices of varying computational capacities. subMFL uses pruning to generate a set of compatible sparse submodels from a trained dense Global Model (GM). These serve as initial global model architectures of a suitable size, allowing resource-constrained devices to join the upcoming training cycles.

Algorithm 1. The overall flow of subMFL.

subMFL Flow: Algorithm 1 shows the overall flow of subMFL. A dense global model is generated on the server and distributed to all devices to be trained (T denotes the number of global training rounds, set to 100 in our simulation; \(W = [w_0, w_1, \ldots , w_n]\), with \(0 \le w_i \le 1\) for \(0 \le i \le n\), denotes the weights, and \({W}_{GM}\) the weights of GM). GM is trained by the capable devices, and a set of sparse submodels (\(SM = [sm_1, sm_2, \ldots, sm_9]\)) is then generated by pruning this GM with different threshold values. Afterwards, the submodels in SM are sent to all devices, starting from the densest and moving to the sparsest, and each device chooses to train the densest submodel compatible with its computational resources. At each step, the devices themselves decide whether to join the next round based on their local model accuracy: devices that reach their preferred accuracy exit the training process, and the next submodel is not shared with them. We represent the devices by \(D = [d_1, d_2, \ldots, d_{1000}]\), and in line 5 this device set is updated according to their preference. Thus, such devices do not consume further energy once the target accuracy is reached.
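To make this flow concrete, the sketch below outlines Algorithm 1 from the server's perspective in Python. It is illustrative only: `federated_training`, `prune_by_threshold`, and `update_device_set` are hypothetical stand-ins for the procedures of Algorithms 2-4 (sketched in the following paragraphs), not the actual implementation.

```python
# Illustrative sketch of the server-side flow of Algorithm 1 (subMFL).
# federated_training, prune_by_threshold and update_device_set are
# hypothetical placeholders corresponding to Algorithms 2-4.

def submfl(gm, devices, rounds=100):
    # 1. Train the dense global model GM; only capable devices contribute.
    gm = federated_training(gm, devices, rounds)

    # 2. Generate submodels sm_1..sm_9 by dataless pruning of the trained GM
    #    with thresholds 0.1, 0.2, ..., 0.9.
    submodels = [prune_by_threshold(gm, t / 10) for t in range(1, 10)]

    # 3. Distribute submodels from densest to sparsest; after each one,
    #    drop devices that have already reached their target accuracy.
    for sm in submodels:
        sm = federated_training(sm, devices, rounds)
        devices = update_device_set(devices, sm)
    return [gm] + submodels
```

Each component of subMFL is described in turn below.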

Training: Algorithm 2 shows the training procedure we use, which follows standard federated learning (S-FL). The weights of the current global model are shared with D and trained for T global rounds. In each round, the local models (\( W _{LMs}^t\)) are collected and aggregated to update the global model with new weights. The updated \({W}_{GM}\) is then shared with D to be updated again on their local datasets. In device-heterogeneous environments, each device needs a different amount of time to complete its local training, which causes synchronisation issues. In subMFL, devices that are too slow to train the current model simply cannot send local updates, while the model is trained by higher-capacity devices. Besides GM, the submodels in SM are also trained; at each step of distributing a sparser submodel, devices with similar resource capacities train the distributed model, which lets the server receive local models synchronously. In this way, we need neither prior information about the devices' computation capacity nor a specific pruning ratio.

Algorithm 2. Training procedure.
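As an illustration of the local update each participating device performs in a round, consider the following PyTorch sketch (assumptions: the `local_loader`, the cross-entropy loss, and the use of Adam here stand in for the actual client code):

```python
import copy
import torch
import torch.nn.functional as F

def local_update(global_model, local_loader, epochs=3, lr=0.001, device="cpu"):
    """One device's local step: copy the current global weights, train on the
    local dataset, and return the updated weights to the server."""
    model = copy.deepcopy(global_model).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for x, y in local_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = F.cross_entropy(model(x), y)
            loss.backward()
            optimizer.step()
    return {k: v.cpu() for k, v in model.state_dict().items()}
```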

For the aggregation process, we use the FedAvg algorithm [23], a widely adopted aggregation strategy with convergence guarantees. The aggregation is adapted to the current SM architecture.
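A minimal sketch of the FedAvg aggregation step, assuming equally sized local datasets (as in our setup), so that a simple element-wise mean of the collected local weights suffices:

```python
import torch

def fedavg(local_weights):
    """Element-wise average of the collected local state_dicts (FedAvg with
    equal local dataset sizes). With unequal data sizes this would become a
    weighted average over the devices' sample counts."""
    averaged = {}
    for key in local_weights[0]:
        averaged[key] = torch.stack(
            [w[key].float() for w in local_weights]
        ).mean(dim=0)
    return averaged
```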

Algorithm 3. Submodel generation.

Generating SubModels: DNN pruning is used to generate the SM from the GM to be distributed to resource-constrained devices. This increases the participation of heterogeneous devices that could not take part in the training process due to their more limited resource capacity. In FL, owing to security and privacy concerns, the server cannot see any data sample, which makes most pruning methods unsuitable for server-side pruning of DNNs. For this reason, we use a dataless pruning method on the server side, which is critical for real-world applications. In this way, all pruning is carried out on the server, decreasing energy usage on resource-constrained devices. We also use an unstructured pruning strategy based on the L1-norm, due to its independence from the network configuration [24].

Algorithm 3 shows submodel generation, where a Threshold variable ranging from 0 to 0.9 and increasing by 0.1 each time is defined for pruning the GM. In this process, the weights of GM that are below the selected threshold are set to 0, and the remaining weights are transferred from the current global model to the newly pruned submodel. Since the threshold is incremented by 0.1, GM produces 9 different submodels (\(sm_i \in SM\)) with various sparsification ratios. As shown in Fig. 3, GM (the red model) is the dense model and the submodels in SM (the blue models) are generated from the pruned version of the trained GM.
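A minimal PyTorch sketch of this dataless, threshold-based pruning step (an illustration of Algorithm 3 under the stated setup, where a weight's L1 magnitude, i.e. its absolute value, is compared against the threshold):

```python
import copy
import torch

def prune_by_threshold(global_model, threshold):
    """Server-side, dataless, unstructured magnitude pruning: weights whose
    absolute value falls below `threshold` are set to zero; the remaining
    trained weights of GM are carried over to the new submodel."""
    submodel = copy.deepcopy(global_model)
    with torch.no_grad():
        for name, param in submodel.named_parameters():
            if "weight" in name:  # prune connection weights, keep biases
                param.masked_fill_(param.abs() < threshold, 0.0)
    return submodel
```

Calling this with thresholds 0.1 through 0.9 then yields the nine submodels \(sm_1, \ldots, sm_9\) with increasing sparsity.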

Fig. 3. Loss, Accuracy and Participation performance of SM using MNIST.

Dropping Devices: When devices reach their target accuracy, they do not train the next submodel (Fig. 1 represents this scenario: device-5 trains one of the \(sm_i\) but does not take part in training \(sm_9\)). For this reason, the server shares the next densest model only with devices that are still participating in training. Algorithm 4 shows how the server updates D, the set of devices that join the training. In line 3, \({d}_{jTargetMinAcc}\) denotes the minimum target accuracy of device \(d_j\). Following this approach, we reduce energy usage by omitting devices that have reached their target.

Algorithm 4. Dropping devices.
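A rough sketch of this device-dropping step (Algorithm 4); the `evaluate` method and the `target_min_acc` attribute are hypothetical names used only for illustration:

```python
def update_device_set(devices, submodel):
    """Keep only the devices whose local accuracy on the current submodel is
    still below their own minimum target accuracy (d_jTargetMinAcc in
    Algorithm 4); the others stop receiving the next, sparser submodel."""
    remaining = []
    for d in devices:
        local_acc = d.evaluate(submodel)   # hypothetical local evaluation
        if local_acc < d.target_min_acc:   # target not reached yet
            remaining.append(d)
    return remaining
```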

As a result, instead of picking a random global model architecture with random weights, as shown in Fig. 2, the trained GM can produce SM, and those submodels can then be shared as sparser global models to be trained by resource-constrained devices. In our approach, even though the subsequent global models become smaller and more constrained due to compression, unlike S-FL they keep the weights transferred from the GM trained on the devices and thus benefit from the devices' unseen data. GM is therefore tuned to the available resources of the devices, each device can pick an \(sm_i\), and resource-constrained devices are not excluded from training.

3 Experiment

We used 1000 devices and distributed the data to them randomly with equal sample sizes. 10% of the devices can train the dense global model (GM), and the remaining devices, which have lower computational capacity, train one of the \(sm_i\).

Datasets: Following the literature in this area, we used the LeNet-5 [25] architecture with the MNIST [26] and FMNIST [27] datasets, which are used for image recognition tasks. While MNIST is a dataset of handwritten digits, FMNIST is a dataset of images depicting various clothing items.

Settings: We performed the simulation using the PyTorch [28] and Flower [29] frameworks. The number of global rounds is set to 100 and the number of local epochs to 3. The validation data percentage is 10% and the batch size is 64. We used the Adam [30] optimiser with a learning rate of 0.001. The remaining parameters are as follows: betas = (0.9, 0.999), eps = 1e−08, weight decay = 0, amsgrad = False, foreach = None, maximize = False, capturable = False, and "min fit clients" = "min eval clients" = 3.
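For clarity, the settings above roughly correspond to the following configuration sketch (illustrative only; it is not the exact simulation script, and the Flower strategy options are shown as plain constants rather than actual API calls):

```python
import torch

# Simulation settings as reported above (remaining Adam arguments are
# PyTorch defaults).
NUM_DEVICES      = 1000
GLOBAL_ROUNDS    = 100
LOCAL_EPOCHS     = 3
VAL_SPLIT        = 0.10
BATCH_SIZE       = 64
MIN_FIT_CLIENTS  = 3      # Flower strategy setting ("min fit clients")
MIN_EVAL_CLIENTS = 3      # Flower strategy setting ("min eval clients")

def make_optimizer(model):
    return torch.optim.Adam(
        model.parameters(), lr=0.001, betas=(0.9, 0.999),
        eps=1e-08, weight_decay=0, amsgrad=False,
    )
```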

Availability: In real-life scenarios, parameters are unlikely to be received from all devices in every round due to factors such as mobility, low energy, and connection issues. Therefore, we assumed that only 30% of devices are available; decreasing this number increases the convergence time of GM.

Baseline: We used standard Federated Learning (S-FL) as our baseline.

Evaluation Metrics: Our evaluation metrics are accuracy (Acc), loss (Loss), participation number (P), and global sparsity (GS), reported for S-FL and subMFL. Server-side threshold-based model pruning is used to generate SM. For instance, when the threshold is set to 0.1, the parameters of GM below 0.1 are set to 0 to generate the first submodel. To generate sparser submodels, the threshold is increased up to 0.9. As the threshold increases, submodels become sparser, reducing computational cost and increasing the number of participating devices. We analyse the metrics for different threshold values and compare the results of subMFL-generated SM with sparse models in S-FL using the same thresholds. The code of this work is publicly available at: https://github.com/zeyneddinoz/subMFL.
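Global sparsity (GS) here refers to the fraction of zero-valued weights in a model. A minimal sketch of how it can be computed (counting only weight tensors, which is an assumption about the exact bookkeeping):

```python
import torch

def global_sparsity(model):
    """Fraction of zero-valued entries across all weight tensors of a model."""
    zeros, total = 0, 0
    for name, param in model.named_parameters():
        if "weight" in name:
            zeros += int((param == 0).sum().item())
            total += param.numel()
    return zeros / total
```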

3.1 Results

Table 1. Metric values based on thresholds for the MNIST dataset.

Table 1 reports the results obtained for the different thresholds (T, from 0.1 to 0.9) used to generate the 9 submodels (\(sm_i \in SM\), see Fig. 3), including accuracy, loss, number of participating devices (P), and global sparsity (GS) of the generated models. For comparison with S-FL, we picked models with different sparsification levels based on the same threshold values, as shown in Fig. 2. In our experiments, the pruning method increases the sparsification of the trained GM significantly while maintaining good accuracy in subMFL. In parallel with the increased model sparsity, the number of participating devices increases.

Accuracy vs Global Sparsity: As shown in Fig. 4-a and Fig. 4-c, global sparsity increases as the threshold value is incremented. However, independently of the sparsification level, S-FL accuracy remains around 10%, because the picked models always start with randomly generated weights. In contrast, although the dense global model (GM) initially has the same accuracy as in S-FL, the submodels (SM) generated after training GM have high accuracy. Thus, although the global model was sparsified, the weights transferred from the previous training allowed the model to maintain a good level of accuracy. For instance, even when global sparsity increases to around 50%, accuracy decreases by only about 2% on the MNIST dataset; under the same condition, accuracy decreases by approximately 10 percentage points on the FMNIST dataset.

Fig. 4. Accuracy vs Global Sparsity and Accuracy vs Participation.

Accuracy vs Participation: As a result of compression, resource-constrained devices can train the compressed submodels, and the participation number increases (see Fig. 4-b and Fig. 4-d). The participation rate grows in parallel with the models' global sparsity, and S-FL and subMFL show similar participation. However, newly joining devices start to train a more accurate global model. For instance, the number of devices participating in the FL system increases by nearly 60% for both S-FL and subMFL; yet, due to the weights transferred from GM, SM shows good accuracy even before training. For the MNIST dataset, accuracy remains around 83% until the threshold value reaches 0.8. For the FMNIST dataset, accuracy remains higher than 60% until the threshold value reaches 0.7.

Comparing participation performance, assume that a middle-level sparse model is selected as the global model in S-FL: all devices start to train a model with Threshold = 0.5 and random weights. Based on Fig. 4-b and Fig. 4-d, device participation would then be under 45%, and devices with higher computation capacity would train a model with stronger learning constraints in order to accommodate resource-constrained devices. In contrast, subMFL enables each device to train and own an optimal model by sharing sparse models in descending order of density. These shared models are SM generated from a pre-trained GM, which results in good accuracy despite compression. Figure 4-b and Fig. 4-d show that subMFL increases participation up to 70%.

3.2 Discussion

Resource-constrained devices need smaller models to train and to share their local updates. If a small global model is selected on the server side, the model cannot generalise patterns in the datasets due to its learning constraints. If, instead, a large global model is shared, resource-constrained devices are unable to train it due to their limited computational capacity. This leads to bias in the trained models and negatively affects performance.

To provide dense DNNs to edge devices in heterogeneous environments, a flexible method is needed. State-of-the-art practice involves model pruning to compress these models for resource-limited devices. However, when the pruning process is left to the device side, it causes extra energy consumption on devices that still need to train their own local models. For this reason, methods are needed that increase the participation rate of heterogeneous devices while pruning models on the server side. In this paper, we addressed this issue by serving compatible submodels to resource-constrained devices.

To sum up, only 10% of the data is utilised to train GM, because only 10% of the devices are capable of training it. Nevertheless, the results show that it is possible to generate compatible SM by pruning GM with different threshold values. These submodels can be shared with resource-constrained devices as new global models to train. Thus, the number of participating devices increases, and since the submodels inherit tuned parameters from the trained GM, they start with good accuracy.

The core idea of our work is to show that, instead of selecting a random global model with around 10% accuracy and distributing it for training (the S-FL approach), it is more useful to start by training a dense global model and then prune it to generate submodels, because the transferred pre-trained weights yield compressed submodels with good accuracy. As new training rounds begin, these submodels can be served to resource-constrained devices, which can then participate with a reasonable starting accuracy. Thus, at the end of the process, each device owns the optimal trainable model for its resources.

4 Conclusion and Future Works

In this paper, we proposed subMFL, a submodel generation technique that uses model pruning to increase the participation of heterogeneous devices in federated learning. This is done without prior information on the devices' hardware or computing capabilities and without determining a specific pruning ratio. In our approach, a dense model is distributed to all devices in the system for training. The trained model is then pruned gradually on the server, without the need for any data sample, to generate a set of submodels. These submodels are shared with resource-constrained devices as compatible sparse models to raise participation. Moreover, since the sparse submodels hold tuned weights from the trained dense model, they achieve satisfactory accuracy even before training, despite being compressed.

Future work could include more in-depth theoretical research on improving the submodels used as global models, which are trained by different device groups, by combining their parameters to increase model generality. Additionally, advanced compression methods could be used to reduce communication overhead in addition to tackling device heterogeneity.