
FPGA-Accelerated Machine Learning Inference as a Service for Particle Physics Computing

  • Original Article
  • Published in Computing and Software for Big Science

Abstract

Large-scale particle physics experiments face challenging demands for high-throughput computing resources both now and in the future. New heterogeneous computing paradigms on dedicated hardware with increased parallelization, such as Field Programmable Gate Arrays (FPGAs), offer exciting solutions with large potential gains. The growing applications of machine learning algorithms in particle physics for simulation, reconstruction, and analysis are naturally deployed on such platforms. We demonstrate that the acceleration of machine learning inference as a web service represents a heterogeneous computing solution for particle physics experiments that potentially requires minimal modification to the current computing model. As examples, we retrain the ResNet-50 convolutional neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC and apply a ResNet-50 model with transfer learning for neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) ms with our experimental physics software framework using Brainwave as a cloud (edge or on-premises) service, representing an improvement by a factor of approximately 30 (175) in model inference latency over traditional CPU inference in current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600–700 inferences per second using an image batch of one, comparable to large batch-size GPU throughput and significantly better than small batch-size GPU throughput. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.
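To make the inference-as-a-service pattern described above concrete, the sketch below shows the client side of a single batch-of-one request to a remote ResNet-50 service, timed the way the round-trip latencies quoted in the abstract would be measured. It is only a minimal illustration, not the Brainwave client used in this work: the endpoint URL, the JSON request format, and the 224x224 input size are assumptions made for the example.

```python
# Minimal, illustrative client for ML inference as a service.
# Assumptions (not from the paper): a REST endpoint at SERVICE_URL that
# accepts a JSON-encoded image tensor and returns class probabilities.
import json
import time

import numpy as np
import requests

SERVICE_URL = "http://inference-service.example.org/v1/models/resnet50:predict"  # hypothetical

def classify(image: np.ndarray) -> dict:
    """Send one image (batch of one) to the remote service and return its response."""
    payload = {"instances": [image.tolist()]}  # batch size of 1
    start = time.perf_counter()
    response = requests.post(SERVICE_URL, data=json.dumps(payload), timeout=30)
    response.raise_for_status()
    latency_ms = 1000.0 * (time.perf_counter() - start)
    print(f"round-trip latency: {latency_ms:.1f} ms")
    return response.json()

if __name__ == "__main__":
    # A dummy 224x224 RGB image standing in for a preprocessed detector image.
    dummy_image = np.random.rand(224, 224, 3).astype(np.float32)
    print(classify(dummy_image))
```

In the actual deployment the analogous call is issued from within the experiment's software framework rather than a standalone Python script, but the structure of the exchange is the same: serialize one image, wait on the remote reply, and record the wall-clock time.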


Notes

  1. We refer synonymously to a cloud service being accessed remotely.

  2. We refer synonymously to an edge service being accessed on-premises, or on-prem.

  3. Adapting TensorFlow to the multithreading pattern used in CMSSW takes significant effort, so the latest version of TensorFlow is usually not available in the experiment's software.

  4. For that matter, CPU comparisons can also be nuanced when considering devices with many cores and large RAM. However, such devices do not fit in with the CMSSW computing model; the sketch following these notes illustrates the alternative pattern of many CPU clients sharing one remote accelerator service.
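The following sketch, referenced in note 4, illustrates the many-clients-to-one-service pattern behind the throughput figures in the abstract: many lightweight CPU client threads share a single remote inference service, each sending batch-of-one requests, and the aggregate rate is reported. It is an illustrative assumption rather than the paper's measurement code; the endpoint URL, request format, thread count, and request count are hypothetical, and the printed numbers depend entirely on the service and the network.

```python
# Illustrative throughput measurement: many client threads, one remote service.
# Assumptions (not from the paper): a REST endpoint SERVICE_URL accepting a
# JSON-encoded image and returning a prediction; all names here are hypothetical.
import json
import time
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import requests

SERVICE_URL = "http://inference-service.example.org/v1/models/resnet50:predict"  # hypothetical
N_CLIENTS = 32           # stand-in for many CPU worker threads/processes
REQUESTS_PER_CLIENT = 25

def one_request(session: requests.Session, image: np.ndarray) -> None:
    """Send a single batch-of-one inference request and wait for the reply."""
    payload = {"instances": [image.tolist()]}
    session.post(SERVICE_URL, data=json.dumps(payload), timeout=30).raise_for_status()

def client_loop(_: int) -> int:
    """Each simulated client sends a fixed number of sequential requests."""
    image = np.random.rand(224, 224, 3).astype(np.float32)
    with requests.Session() as session:
        for _ in range(REQUESTS_PER_CLIENT):
            one_request(session, image)
    return REQUESTS_PER_CLIENT

if __name__ == "__main__":
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=N_CLIENTS) as pool:
        total = sum(pool.map(client_loop, range(N_CLIENTS)))
    elapsed = time.perf_counter() - start
    print(f"{total} inferences in {elapsed:.1f} s "
          f"-> {total / elapsed:.0f} inferences/s aggregate throughput")
```

Because each client only serializes inputs and issues a remote call, none of them needs to link against the ML framework at all, which is the point of note 3: the framework version is pinned on the server, not inside the experiment's software.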


Acknowledgements

We would like to thank the entire Microsoft Azure Machine Learning, Bing, and Project Brainwave teams for the development of and opportunity to preview and study the acceleration platform. In particular, we would like to acknowledge Doug Burger, Eric Chung, Jeremy Fowers, Daniel Lo, Kalin Ovtcharov, and Andrew Putnam for their support and enthusiasm. We would like to thank Lothar Bauerdick and Oliver Gutsche for seed funding through USCMS computing operations. We would like to thank Alex Himmel and other NOvA collaborators for support and comments on the manuscript. Part of this work was conducted at “iBanks,” the AI GPU cluster at Caltech. We acknowledge NVIDIA, SuperMicro, and the Kavli Foundation for their support of “iBanks.” Part of this work was conducted using Google Cloud resources provided by the MIT Quest for Intelligence program. Part of this work is supported through IRIS-HEP under NSF grant 1836650. We thank the organizers of the publicly available top tagging dataset (and others like it) for providing benchmarks for the physics community. The authors thank the NOvA collaboration for the use of its Monte Carlo software tools and data and for the review of this manuscript. This work was supported by the US Department of Energy and the US National Science Foundation. NOvA receives additional support from the Department of Science and Technology, India; the European Research Council; the MSMT CR, Czech Republic; the RAS, RMES, and RFBR, Russia; CNPq and FAPEG, Brazil; and the State and University of Minnesota. We are grateful for the contributions of the staff at the Ash River Laboratory, Argonne National Laboratory, and Fermilab.

Author information


Corresponding author

Correspondence to Nhan Tran.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

J.D., B.H., S.J., B.K., M.L., K.P., N.T., and A.T. are supported by Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the U.S. Department of Energy, Office of Science, Office of High Energy Physics. P.H. and D.R. are supported by a Massachusetts Institute of Technology University grant. M.P., J.N., and V.L. received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant agreement no 772369). V.L. also received funding from the Ministry of Education, Science, and Technological Development of the Republic of Serbia under project ON171017. S-C.H. is supported by DOE Office of Science, Office of High Energy Physics Early Career Research program under Award No. DE-SC0015971. S.H., M.T., and D.W. are supported by F5 Networks. Z. W. is supported by the National Science Foundation under Grants No. 1606321 and 115164.


About this article


Cite this article

Duarte, J., Harris, P., Hauck, S. et al. FPGA-Accelerated Machine Learning Inference as a Service for Particle Physics Computing. Comput Softw Big Sci 3, 13 (2019). https://doi.org/10.1007/s41781-019-0027-2
