DOI: 10.1145/3543622.3573047
short-paper

Exploring the Versal AI Engines for Accelerating Stencil-based Atmospheric Advection Simulation

Published: 12 February 2023

Abstract

AMD Xilinx's new Versal Adaptive Compute Acceleration Platform (ACAP) is an FPGA architecture combining reconfigurable fabric with other on-chip hardened compute resources. AI engines are one of these and, by operating in a highly vectorized manner, they provide significant raw compute that is potentially beneficial for a range of workloads, including HPC simulation. However, this technology is still at an early stage and is as yet unproven for accelerating HPC codes, with a lack of benchmarking and best practice.
This paper presents an experience report exploring the porting of the Piacsek and Williams (PW) advection scheme onto the Versal ACAP, using the chip's AI engines to accelerate the compute. A stencil-based algorithm, advection is commonplace in atmospheric modelling, including in several codes from the Met Office, who initially developed this scheme. Using this algorithm as a vehicle, we explore optimal approaches for structuring AI engine compute kernels and how best to interface the AI engines with programmable logic. We evaluate performance on a VCK5000 against non-AI engine FPGA configurations on the VCK5000 and Alveo U280, as well as a 24-core Xeon Platinum Cascade Lake CPU and an Nvidia V100 GPU. We find that, whilst the number of channels between the fabric and AI engines is a limitation, by leveraging the ACAP we can double performance compared to an Alveo U280.
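
To make "stencil-based" concrete, the following is a minimal illustrative sketch of a stencil update for an advection-like term. It is not the PW scheme and not the kernel used in the paper. It assumes a 1D periodic grid and a simple centered difference of the flux, and the names `advect_1d`, `field`, `velocity` and `dx` are hypothetical. The point it shows is that each output cell depends only on its immediate neighbours.

```python
def advect_1d(field, velocity, dx):
    """Centered-difference estimate of -d(velocity * field)/dx on a periodic 1D grid.

    Illustrative only: a generic three-point stencil, not the PW scheme.
    """
    n = len(field)
    tendency = [0.0] * n
    for i in range(n):
        left = (i - 1) % n   # periodic wrap-around to the left neighbour
        right = (i + 1) % n  # periodic wrap-around to the right neighbour
        flux_left = velocity[left] * field[left]
        flux_right = velocity[right] * field[right]
        # Each cell reads only its two neighbours: this is the stencil.
        tendency[i] = -(flux_right - flux_left) / (2.0 * dx)
    return tendency


if __name__ == "__main__":
    field = [0.0, 1.0, 2.0, 1.0]
    velocity = [1.0, 1.0, 1.0, 1.0]
    print(advect_1d(field, velocity, dx=1.0))
```

Because every cell applies the same small, fixed access pattern, updates of this kind are natural candidates for vectorized hardware, which is the property the abstract refers to.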


Published In

FPGA '23: Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays
February 2023
283 pages
ISBN: 9781450394178
DOI: 10.1145/3543622
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. AI engines
  2. FPGAs
  3. HPC
  4. VCK5000
  5. atmospheric advection
  6. stencil-based algorithms
  7. Versal ACAP

Qualifiers

  • Short-paper

Funding Sources

  • EPSRC
  • STFC

Conference

FPGA '23
Acceptance Rates

Overall Acceptance Rate 125 of 627 submissions, 20%


Cited By

  • (2024) TaPaS Co-AIE: An Open-Source Framework for Streaming-Based Heterogeneous Acceleration Using AMD AI Engines. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 155-161. DOI: 10.1109/IPDPSW63119.2024.00041. Online publication date: 27-May-2024
  • (2024) Efficient Approaches for GEMM Acceleration on Leading AI-Optimized FPGAs. 2024 IEEE 32nd Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 54-65. DOI: 10.1109/FCCM60383.2024.00015. Online publication date: 5-May-2024
  • (2023) MaxEVA: Maximizing the Efficiency of Matrix Multiplication on Versal AI Engine. 2023 International Conference on Field Programmable Technology (ICFPT), 96-105. DOI: 10.1109/ICFPT59805.2023.00016. Online publication date: 12-Dec-2023
  • (2023) AIM: Accelerating Arbitrary-Precision Integer Multiplication on Heterogeneous Reconfigurable Computing Platform Versal ACAP. 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 1-9. DOI: 10.1109/ICCAD57390.2023.10323754. Online publication date: 28-Oct-2023
  • EA4RCA: Efficient AIE accelerator design framework for regular Communication-Avoiding Algorithm. ACM Transactions on Architecture and Code Optimization. DOI: 10.1145/3678010
