DOI: 10.1145/3229591.3229594

Can the Network be the AI Accelerator?

Published: 07 August 2018

Abstract

Artificial Neural Networks (NNs) play an increasingly important role in many services and applications, contributing significantly to the workloads of compute infrastructures. In latency-sensitive services, NNs are usually processed by CPUs, since using an external dedicated hardware accelerator would be inefficient. However, as workloads grow in size and complexity, CPUs are hitting their computation limits, requiring the introduction of new specialized hardware accelerators tailored to the task. In this paper we analyze the option of using programmable network devices, such as network cards and switches, as NN accelerators in place of purpose-built dedicated hardware. To this end, in this preliminary work we analyze in depth the properties of NN processing on CPUs, derive options to efficiently split such processing, and show that programmable network devices may be a suitable engine for implementing a CPU's NN co-processor.

Information & Contributors

Information

Published In

NetCompute '18: Proceedings of the 2018 Morning Workshop on In-Network Computing
August 2018
44 pages
ISBN:9781450359085
DOI:10.1145/3229591
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Computation offloading
  2. Programmable switches
  3. SmartNIC

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SIGCOMM '18
SIGCOMM '18: ACM SIGCOMM 2018 Conference
August 20, 2018
Budapest, Hungary

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 572
  • Downloads (Last 6 weeks): 43
Reflects downloads up to 16 Dec 2024

Citations

Cited By

  • (2024) Planter: Rapid Prototyping of In-Network Machine Learning Inference. ACM SIGCOMM Computer Communication Review 54:1 (2-21). DOI: 10.1145/3687230.3687232. Online publication date: 6-Aug-2024
  • (2024) Native Support of AI Applications in 6G Mobile Networks Via an Intelligent User Plane. 2024 IEEE Wireless Communications and Networking Conference (WCNC) (1-6). DOI: 10.1109/WCNC57260.2024.10570691. Online publication date: 21-Apr-2024
  • (2024) XNetIoT: An Extreme Quantized Neural Network Architecture for IoT Environment Using P4. IEEE Transactions on Network and Service Management 21:5 (5756-5767). DOI: 10.1109/TNSM.2024.3423317. Online publication date: Oct-2024
  • (2024) Marina: Realizing ML-Driven Real-Time Network Traffic Monitoring at Terabit Scale. IEEE Transactions on Network and Service Management 21:3 (2773-2790). DOI: 10.1109/TNSM.2024.3382393. Online publication date: Jun-2024
  • (2024) ProFi: Scalable and Efficient Website Fingerprinting. IEEE Transactions on Network and Service Management 21:1 (1271-1286). DOI: 10.1109/TNSM.2023.3318508. Online publication date: Feb-2024
  • (2024) IIsy: Hybrid In-Network Classification Using Programmable Switches. IEEE/ACM Transactions on Networking 32:3 (2555-2570). DOI: 10.1109/TNET.2024.3364757. Online publication date: Jun-2024
  • (2024) Enabling Programmable Data Planes with C++ and High-Level Synthesis for Custom Packet Forwarding. 2024 37th SBC/SBMicro/IEEE Symposium on Integrated Circuits and Systems Design (SBCCI) (1-5). DOI: 10.1109/SBCCI62366.2024.10704008. Online publication date: 2-Sep-2024
  • (2024) Spinner: Enabling In-network Flow Clustering Entirely in a Programmable Data Plane. NOMS 2024-2024 IEEE Network Operations and Management Symposium (1-9). DOI: 10.1109/NOMS59830.2024.10575413. Online publication date: 6-May-2024
  • (2024) RIDS: Towards Advanced IDS via RNN Model and Programmable Switches Co-Designed Approaches. IEEE INFOCOM 2024 - IEEE Conference on Computer Communications (591-600). DOI: 10.1109/INFOCOM52122.2024.10621290. Online publication date: 20-May-2024
  • (2024) In-Network Machine Learning Using Programmable Network Devices: A Survey. IEEE Communications Surveys & Tutorials 26:2 (1171-1200). DOI: 10.1109/COMST.2023.3344351. Online publication date: Oct-2024