High performance in the cloud with FPGA groups

Authors:

Guillaume Pierre,

Jose Gabriel de F. Coutinho,

Mark Stillwell

UCC '16: Proceedings of the 9th International Conference on Utility and Cloud Computing

Pages 1 - 10

https://doi.org/10.1145/2996890.2996895

Published: 06 December 2016

Abstract

Field-programmable gate arrays (FPGAs) can offer invaluable computational performance for many compute-intensive algorithms. However, to justify their purchase and administration costs it is necessary to maximize resource utilization over their expected lifetime. Making FPGAs available in a cloud environment would make them attractive to new types of users and applications and help democratize this increasingly popular technology. However, there currently exists no satisfactory technique for offering FPGAs as cloud resources and sharing them between multiple tenants. We propose FPGA groups, which are seen by their clients as a single virtual FPGA, and which aggregate the computational power of multiple physical FPGAs. FPGA groups are elastic, and they may be shared among multiple tenants. We present an autoscaling algorithm to maximize FPGA groups' resource utilization and reduce user-perceived computation latencies. FPGA groups incur a low overhead in the order of 0.09 ms per submitted task. When faced with a challenging workload, the autoscaling algorithm increases resource utilization from 52% to 61% compared to a static resource allocation, while reducing task execution latencies by 61%.

Cited By

Li Z, Hao Y, Gao H, Zhou J (2024). Towards Intelligent Edge Computing: A Resource- and Reliability-Aware Hybrid Scheduling Method on Multi-FPGA Systems. Electronics 14:1 (82). Online publication date: 27-Dec-2024. https://doi.org/10.3390/electronics14010082

Gao H, Li Z, Zhou L, Li X, Wang Q (2024). GLRM: Geometric Layout-Based Resource Management Method on Multiple Field Programmable Gate Array Systems. Electronics 13:10 (1821). Online publication date: 8-May-2024. https://doi.org/10.3390/electronics13101821

Lim S, Jun S (2024). FlexForge: Efficient Reconfigurable Cloud Acceleration via Peripheral Resource Disaggregation. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE) (1-6). Online publication date: 25-Mar-2024. https://doi.org/10.23919/DATE58400.2024.10546641

Index Terms

High performance in the cloud with FPGA groups
1. Computer systems organization
  1. Architectures
    1. Distributed architectures
      1. Cloud computing
    2. Other architectures
      1. Data flow architectures

Information & Contributors

Information

Published In

UCC '16: Proceedings of the 9th International Conference on Utility and Cloud Computing

December 2016

549 pages

ISBN:9781450346160

DOI:10.1145/2996890

General Chairs: Changjun Jiang (Tongji University, China), Omer Rana (Cardiff University, UK), Nick Antonopoulos (University of Derby, UK)

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

European Commission

Conference

UCC '16

UCC '16: 9th International Conference on Utility and Cloud Computing

December 6 - 9, 2016

Shanghai, China

Acceptance Rates

Overall Acceptance Rate 38 of 125 submissions, 30%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 38
Total Downloads: 239
Downloads (Last 12 months): 10
Downloads (Last 6 weeks): 2

Reflects downloads up to 03 Mar 2025

Citations

Damiani A, Fiscaletti G, Bacis M, Brondolin R, Santambrogio M (2022). BlastFunction: A Full-stack Framework Bringing FPGA Hardware Acceleration to Cloud-native Applications. ACM Transactions on Reconfigurable Technology and Systems 15:2 (1-27). Online publication date: 11-Jan-2022. https://dl.acm.org/doi/10.1145/3472958

Al Qassem L, Stouraitis T, Damiani E, Elfadel I (2022). FPGAaaS: A Survey of Infrastructures and Systems. IEEE Transactions on Services Computing 15:2 (1143-1156). Online publication date: 1-Mar-2022. https://doi.org/10.1109/TSC.2020.2976012

Guo J, Zhang L, Romero Hung J, Li C, Zhao J, Guo M (2022). FPGA sharing in the cloud: a comprehensive analysis. Frontiers of Computer Science 17:5. Online publication date: 24-Dec-2022. https://doi.org/10.1007/s11704-022-2127-0

Zhou P, Sheng J, Yu C, Wei P, Wang J, Wu D, Cong J, Shannon L, Adler M (2021). MOCHA. The 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (273-279). Online publication date: 17-Feb-2021. https://dl.acm.org/doi/10.1145/3431920.3439304

Zhu Z, Liu A, Zhang F, Chen F (2021). FPGA Resource Pooling in Cloud Computing. IEEE Transactions on Cloud Computing 9:2 (610-626). Online publication date: 1-Apr-2021. https://doi.org/10.1109/TCC.2018.2874011

Skhiri R, Fresse V, Malek J, Suffran B, Jamont J (2021). An approach for an efficient sharing of IP as a Service in cloud FPGA. 2021 18th International Multi-Conference on Systems, Signals & Devices (SSD) (784-789). Online publication date: 22-Mar-2021. https://doi.org/10.1109/SSD52085.2021.9429338

Nakamura T, Saito S, Fujimoto K, Kaneko M, Shiraga A (2021). Time-Division Multiplexing for FPGA Considering CNN Model Switch Time. 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (433-438). Online publication date: Jun-2021. https://doi.org/10.1109/IPDPSW52791.2021.00074
