DOI: 10.1145/2996890.2996895
research-article

High performance in the cloud with FPGA groups

Published: 06 December 2016

Abstract

Field-programmable gate arrays (FPGAs) can offer invaluable computational performance for many compute-intensive algorithms. However, to justify their purchase and administration costs it is necessary to maximize resource utilization over their expected lifetime. Making FPGAs available in a cloud environment would make them attractive to new types of users and applications and help democratize this increasingly popular technology. However, there currently exists no satisfactory technique for offering FPGAs as cloud resources and sharing them between multiple tenants. We propose FPGA groups, which are seen by their clients as a single virtual FPGA, and which aggregate the computational power of multiple physical FPGAs. FPGA groups are elastic, and they may be shared among multiple tenants. We present an autoscaling algorithm to maximize FPGA groups' resource utilization and reduce user-perceived computation latencies. FPGA groups incur a low overhead, on the order of 0.09 ms per submitted task. When faced with a challenging workload, the autoscaling algorithm increases resource utilization from 52% to 61% compared to a static resource allocation, while reducing task execution latencies by 61%.
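
The abstract does not spell out the autoscaling algorithm, so the following is only a rough illustration of what "elastic" groups could mean in practice. It is a minimal Python sketch under stated assumptions: the class `FpgaGroup`, the function `rebalance`, and the greedy backlog-based policy are all hypothetical and are not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FpgaGroup:
    """Hypothetical model: several physical FPGAs behind one shared task queue."""
    name: str
    num_fpgas: int  # assumed to be at least 1
    queue: list[float] = field(default_factory=list)  # pending task durations (ms)

    def submit(self, duration_ms: float) -> None:
        # Clients see a single virtual FPGA, so they only ever submit to the group.
        self.queue.append(duration_ms)

    def backlog_ms(self) -> float:
        # Estimated drain time if queued work spreads evenly over the devices.
        return sum(self.queue) / self.num_fpgas


def rebalance(groups: list[FpgaGroup]) -> int:
    """Greedy policy (an assumption, not the paper's algorithm): move one FPGA
    at a time from the least loaded group to the most loaded one, as long as
    the move lowers the worst backlog of the two groups involved."""
    moves = 0
    while True:
        busiest = max(groups, key=FpgaGroup.backlog_ms)
        # A donor must keep at least one FPGA after giving one away.
        donors = [g for g in groups if g is not busiest and g.num_fpgas > 1]
        if not donors:
            return moves
        donor = min(donors, key=FpgaGroup.backlog_ms)
        before = busiest.backlog_ms()
        after = max(
            sum(busiest.queue) / (busiest.num_fpgas + 1),
            sum(donor.queue) / (donor.num_fpgas - 1),
        )
        if after >= before:
            return moves  # no further improvement possible
        donor.num_fpgas -= 1
        busiest.num_fpgas += 1
        moves += 1
```

For example, a group with one FPGA and four queued 10 ms tasks, next to a group with three FPGAs and a single 6 ms task, would receive devices from its neighbour until the neighbour is down to one FPGA. A real autoscaler would also have to account for reconfiguration cost and tenant isolation, which this sketch ignores.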

Published In

UCC '16: Proceedings of the 9th International Conference on Utility and Cloud Computing
December 2016
549 pages
ISBN: 9781450346160
DOI: 10.1145/2996890
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. FPGA
  2. cloud computing
  3. virtualization

Qualifiers

  • Research-article

Conference

UCC '16

Acceptance Rates

Overall Acceptance Rate 38 of 125 submissions, 30%

Article Metrics

  • Downloads (last 12 months): 11
  • Downloads (last 6 weeks): 1

Reflects downloads up to 18 Nov 2024

Citations

Cited By

  • (2024) GLRM: Geometric Layout-Based Resource Management Method on Multiple Field Programmable Gate Array Systems. Electronics 13:10 (1821). DOI: 10.3390/electronics13101821. Online publication date: 8-May-2024
  • (2024) FlexForge: Efficient Reconfigurable Cloud Acceleration via Peripheral Resource Disaggregation. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE) (1-6). DOI: 10.23919/DATE58400.2024.10546641. Online publication date: 25-Mar-2024
  • (2022) BlastFunction: A Full-stack Framework Bringing FPGA Hardware Acceleration to Cloud-native Applications. ACM Transactions on Reconfigurable Technology and Systems 15:2 (1-27). DOI: 10.1145/3472958. Online publication date: 11-Jan-2022
  • (2022) FPGAaaS: A Survey of Infrastructures and Systems. IEEE Transactions on Services Computing 15:2 (1143-1156). DOI: 10.1109/TSC.2020.2976012. Online publication date: 1-Mar-2022
  • (2022) FPGA sharing in the cloud: a comprehensive analysis. Frontiers of Computer Science 17:5. DOI: 10.1007/s11704-022-2127-0. Online publication date: 24-Dec-2022
  • (2021) MOCHA. The 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (273-279). DOI: 10.1145/3431920.3439304. Online publication date: 17-Feb-2021
  • (2021) FPGA Resource Pooling in Cloud Computing. IEEE Transactions on Cloud Computing 9:2 (610-626). DOI: 10.1109/TCC.2018.2874011. Online publication date: 1-Apr-2021
  • (2021) An approach for an efficient sharing of IP as a Service in cloud FPGA. 2021 18th International Multi-Conference on Systems, Signals & Devices (SSD) (784-789). DOI: 10.1109/SSD52085.2021.9429338. Online publication date: 22-Mar-2021
  • (2021) Time-Division Multiplexing for FPGA Considering CNN Model Switch Time. 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (433-438). DOI: 10.1109/IPDPSW52791.2021.00074. Online publication date: Jun-2021
  • (2021) A Case for Function-as-a-Service with Disaggregated FPGAs. 2021 IEEE 14th International Conference on Cloud Computing (CLOUD) (333-344). DOI: 10.1109/CLOUD53861.2021.00047. Online publication date: Sep-2021
