Evaluating POWER Architecture for Distributed Training of Generative Adversarial Networks

Ahmad Hesam¹²,¹³,
Sofia Vallecorsa¹³,
Gulrukh Khattak¹³ &
…
Federico Carminati¹³

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11887)

Included in the following conference series:

International Conference on High Performance Computing

Abstract

The increased availability of High-Performance Computing resources can enable data scientists to deploy and evaluate data-driven approaches, notably in the field of deep learning, at a rapid pace. As deep neural networks become more complex and ingest increasingly large datasets, it becomes impractical to perform the training phase on a single machine instance because of memory constraints and extremely long training times. Rather than scaling up, scaling out the computing resources is a productive approach to improve performance. The paradigm of data parallelism allows us to split the training dataset into manageable chunks that can be processed in parallel. In this work, we evaluate the scaling performance of training a 3D generative adversarial network (GAN) on an IBM POWER8 cluster equipped with 12 NVIDIA P100 GPUs. The full training duration of the GAN, including evaluation, is reduced from 20 h and 16 min on a single GPU to 2 h and 14 min on all 12 GPUs. Taking only the training process into consideration, we achieve a scaling efficiency of 98.9% when scaling from 1 to 12 GPUs.
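The relation between the wall-clock times quoted in the abstract and a scaling-efficiency figure can be sketched as follows. This is a minimal illustration, assuming the conventional definition of efficiency as speedup divided by the number of workers; the abstract does not state the formula the authors used, and the helper names `to_minutes`, `speedup`, and `scaling_efficiency` are hypothetical.

```python
def to_minutes(hours: int, minutes: int) -> int:
    """Convert an 'H h M min' duration to whole minutes."""
    return 60 * hours + minutes


def speedup(t_single: float, t_parallel: float) -> float:
    """Ratio of single-worker time to multi-worker time."""
    return t_single / t_parallel


def scaling_efficiency(t_single: float, t_parallel: float, n_workers: int) -> float:
    """Assumed definition: achieved speedup relative to the ideal n_workers-fold speedup."""
    return speedup(t_single, t_parallel) / n_workers


# Full training duration including evaluation, as quoted in the abstract.
t_1gpu = to_minutes(20, 16)   # 1216 minutes on a single GPU
t_12gpu = to_minutes(2, 14)   # 134 minutes on 12 GPUs

s = speedup(t_1gpu, t_12gpu)
e = scaling_efficiency(t_1gpu, t_12gpu, 12)
print(f"speedup: {s:.2f}x, efficiency: {e:.1%}")
```

Under this definition, the end-to-end times (training plus evaluation) give a speedup of roughly 9.07x and an efficiency of about 75.6%. That is lower than the 98.9% the abstract reports for the training process alone, which suggests that the evaluation step accounts for most of the gap.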

Notes

1.
Except for the first epoch, which includes the cuDNN configuration time. For this reason, we start timing from the second epoch.
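The timing convention in this note can be sketched as below. This is an illustrative helper rather than the authors' benchmarking code: `mean_epoch_time`, `run_epoch`, `warmup_epochs`, and the injectable `clock` are all hypothetical names, and the only behaviour taken from the note is that the first epoch is left out of the measurement.

```python
import time
from typing import Callable


def mean_epoch_time(
    run_epoch: Callable[[int], None],
    n_epochs: int,
    warmup_epochs: int = 1,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Run n_epochs and return the mean duration of the epochs after warm-up.

    The first `warmup_epochs` epochs still run, but their durations are
    discarded because they include one-off setup costs such as the cuDNN
    configuration time.
    """
    durations = []
    for epoch in range(n_epochs):
        start = clock()
        run_epoch(epoch)  # assumed to perform one full pass over the data
        durations.append(clock() - start)

    timed = durations[warmup_epochs:]
    if not timed:
        raise ValueError("n_epochs must exceed warmup_epochs")
    return sum(timed) / len(timed)
```

Passing the clock as a parameter keeps the helper testable with a fake timer; in a real run the default `time.perf_counter` would be used.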

Author information

Authors and Affiliations

Delft University of Technology, Delft, Netherlands
Ahmad Hesam
CERN, Geneva, Switzerland
Ahmad Hesam, Sofia Vallecorsa, Gulrukh Khattak & Federico Carminati

Corresponding author

Correspondence to Ahmad Hesam.

Editor information

Editors and Affiliations

University of Edinburgh, Edinburgh, UK
Michèle Weiland
Helmholtz-Zentrum Dresden-Rossendorf, Dresden, Sachsen, Germany
Guido Juckeland
Swiss National Supercomputing Centre, Lugano, Ticino, Switzerland
Sadaf Alam
University of Tennessee at Knoxville, Knoxville, TN, USA
Heike Jagode

About this paper

Cite this paper

Hesam, A., Vallecorsa, S., Khattak, G., Carminati, F. (2019). Evaluating POWER Architecture for Distributed Training of Generative Adversarial Networks. In: Weiland, M., Juckeland, G., Alam, S., Jagode, H. (eds) High Performance Computing. ISC High Performance 2019. Lecture Notes in Computer Science, vol 11887. Springer, Cham. https://doi.org/10.1007/978-3-030-34356-9_32

DOI: https://doi.org/10.1007/978-3-030-34356-9_32
Published: 03 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34355-2
Online ISBN: 978-3-030-34356-9
eBook Packages: Computer Science, Computer Science (R0)
