Abstract
Large-scale particle physics experiments face challenging demands for high-throughput computing resources both now and in the future. New heterogeneous computing paradigms on dedicated hardware with increased parallelization, such as Field Programmable Gate Arrays (FPGAs), offer exciting solutions with large potential gains. The growing applications of machine learning algorithms in particle physics for simulation, reconstruction, and analysis are naturally deployed on such platforms. We demonstrate that the acceleration of machine learning inference as a web service represents a heterogeneous computing solution for particle physics experiments that potentially requires minimal modification to the current computing model. As examples, we retrain the ResNet-50 convolutional neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC and apply a ResNet-50 model with transfer learning for neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) ms with our experimental physics software framework using Brainwave as a cloud (edge or on-premises) service, representing an improvement by a factor of approximately 30 (175) in model inference latency over traditional CPU inference in current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600–700 inferences per second using an image batch of one, comparable to large batch-size GPU throughput and significantly better than small batch-size GPU throughput. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.
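The client side of this inference-as-a-service pattern can be illustrated with a short sketch: a CPU worker ships one preprocessed image to a remote (cloud or edge) service and blocks on the returned classification scores, matching the batch-of-one processing model quoted above. This is a minimal illustration only, assuming a gRPC-style request/response interface; the module, service, and message names (inference_pb2, InferenceStub, ImageRequest, Classify) are hypothetical placeholders, not the actual Brainwave or SonicCMS interfaces.

```python
# Minimal sketch of a CPU-side client for remote FPGA inference.
# All service/message names are hypothetical; the stubs below would be
# generated by protoc from a (hypothetical) inference.proto definition.
import grpc
import numpy as np

import inference_pb2        # hypothetical generated protobuf messages
import inference_pb2_grpc   # hypothetical generated gRPC service stub


def classify_image(pixels: np.ndarray,
                   host: str = "fpga-service.example.org:50051",
                   timeout_s: float = 0.1) -> list:
    """Send one preprocessed image (batch size of one) to a remote
    inference service and return the class scores."""
    with grpc.insecure_channel(host) as channel:
        stub = inference_pb2_grpc.InferenceStub(channel)
        request = inference_pb2.ImageRequest(
            data=pixels.astype(np.float32).tobytes(),
            shape=list(pixels.shape),
        )
        # Blocking remote call; the FPGA service performs the ResNet-50
        # forward pass and returns the output scores.
        reply = stub.Classify(request, timeout=timeout_s)
        return list(reply.scores)
```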
Notes
We use the term cloud service synonymously with a service accessed remotely.
We use the term edge service synonymously with a service accessed on-premises, or on-prem.
Significant effort is required to adapt TensorFlow to the multithreading pattern used in CMSSW; hence, the latest version of TensorFlow is usually not available for use in the experiment's software.
For that matter, CPU comparisons can also be nuanced when considering devices with many cores and large RAM. However, such devices do not fit within the CMSSW computing model.
Acknowledgements
We would like to thank the entire Microsoft Azure Machine Learning, Bing, and Project Brainwave teams for developing the acceleration platform and for the opportunity to preview and study it. In particular, we would like to acknowledge Doug Burger, Eric Chung, Jeremy Fowers, Daniel Lo, Kalin Ovtcharov, and Andrew Putnam for their support and enthusiasm. We would like to thank Lothar Bauerdick and Oliver Gutsche for seed funding through USCMS computing operations. We would like to thank Alex Himmel and other NOvA collaborators for support and comments on the manuscript. Part of this work was conducted at "iBanks," the AI GPU cluster at Caltech. We acknowledge NVIDIA, SuperMicro, and the Kavli Foundation for their support of "iBanks." Part of this work was conducted using Google Cloud resources provided by the MIT Quest for Intelligence program. Part of this work is supported through IRIS-HEP under NSF grant 1836650. We thank the organizers of the publicly available top tagging dataset (and others like it) for providing benchmarks for the physics community. The authors thank the NOvA collaboration for the use of its Monte Carlo software tools and data and for the review of this manuscript. This work was supported by the US Department of Energy and the US National Science Foundation. NOvA receives additional support from the Department of Science and Technology, India; the European Research Council; the MSMT CR, Czech Republic; the RAS, RMES, and RFBR, Russia; CNPq and FAPEG, Brazil; and the State and University of Minnesota. We are grateful for the contributions of the staff at the Ash River Laboratory, Argonne National Laboratory, and Fermilab.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
J.D., B.H., S.J., B.K., M.L., K.P., N.T., and A.T. are supported by Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the U.S. Department of Energy, Office of Science, Office of High Energy Physics. P.H. and D.R. are supported by a Massachusetts Institute of Technology University grant. M.P., J.N., and V.L. received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (Grant agreement No. 772369). V.L. also received funding from the Ministry of Education, Science, and Technological Development of the Republic of Serbia under project ON171017. S-C.H. is supported by the DOE Office of Science, Office of High Energy Physics Early Career Research program under Award No. DE-SC0015971. S.H., M.T., and D.W. are supported by F5 Networks. Z.W. is supported by the National Science Foundation under Grant Nos. 1606321 and 115164.
About this article
Cite this article
Duarte, J., Harris, P., Hauck, S. et al. FPGA-Accelerated Machine Learning Inference as a Service for Particle Physics Computing. Comput Softw Big Sci 3, 13 (2019). https://doi.org/10.1007/s41781-019-0027-2