DOI: 10.5555/3314872.3314916
Article

Translating CUDA to OpenCL for hardware generation using neural machine translation

Published: 16 February 2019

Abstract

Hardware generation from high-level languages such as C/C++ has been a dream of software and hardware engineers for decades. Several high-level synthesis (HLS) frameworks and domain-specific languages (DSLs) have been developed to narrow the gap between high-level languages and hardware description languages. However, each language tends to target specific applications, or the DSL carries a steep learning curve, leaving designers with many programming languages and tool chains. To address these challenges, we propose source-to-source translation, so that the hardware designer can pick and choose the target HLS/DSL framework that synthesizes to the best-performing hardware. In this work, we present source-to-source translation from CUDA to OpenCL using neural machine translation (NMT), which we call PLNMT. The contribution of our work is a set of techniques for generating training inputs. To build the training dataset, we extract CUDA API usages from CUDA examples and write the corresponding OpenCL API usages. From the acquired pairs of API usages, we construct API usage trees that help users find unseen usages in new samples and easily add them to the training input. Our initial results show that we can translate many applications from benchmarks such as the CUDA SDK, polybench-gpu, and Rodinia. Furthermore, we show that kernel code translated from CUDA applications can run in the OpenCL FPGA framework, which suggests a new direction for HLS.
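The PLNMT pipeline itself is not reproduced on this page, but the training-input idea the abstract describes — pairing an extracted CUDA API usage with a hand-written OpenCL equivalent to form a parallel (source, target) sample for NMT — can be sketched roughly as follows. This is an illustrative assumption, not the authors' code: the mapping table covers only a few common constructs, and `translate_line`/`make_training_pair` are hypothetical names. A real translator must also restructure call arguments (e.g. `cudaMalloc` and `clCreateBuffer` have different signatures); this sketch only pairs sentences.

```python
# Sketch (not the authors' implementation): build parallel sentence pairs
# from CUDA lines and their hand-derived OpenCL counterparts.

# Hypothetical, deliberately tiny CUDA -> OpenCL construct mapping.
CUDA_TO_OPENCL = {
    "cudaMalloc": "clCreateBuffer",
    "cudaMemcpy": "clEnqueueWriteBuffer",   # assuming host-to-device copies
    "cudaFree": "clReleaseMemObject",
    "__global__": "__kernel",
    "threadIdx.x": "get_local_id(0)",
    "blockIdx.x": "get_group_id(0)",
}

def translate_line(cuda_line):
    """Rewrite every known CUDA construct in the line to its OpenCL counterpart."""
    out = cuda_line
    for cuda_api, opencl_api in CUDA_TO_OPENCL.items():
        out = out.replace(cuda_api, opencl_api)
    return out

def make_training_pair(cuda_line):
    """Return a (CUDA, OpenCL) sentence pair, or None for an unseen usage.

    A None result marks a candidate to record in the API usage tree so a
    corresponding OpenCL usage can be written and added to the training set.
    """
    target = translate_line(cuda_line)
    return (cuda_line, target) if target != cuda_line else None
```

For example, `make_training_pair("__global__ void add(float *a) { a[threadIdx.x] += 1; }")` yields a pair whose target side uses `__kernel` and `get_local_id(0)`, while a line with no known CUDA API returns `None` and would be routed to the usage tree for manual pairing.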


Cited By

  • (2021) Specifying and testing GPU workgroup progress models. Proceedings of the ACM on Programming Languages, 5(OOPSLA), 1–30. DOI: 10.1145/3485508. Online publication date: 15 October 2021.


Published In

CGO 2019: Proceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization
February 2019
286 pages
ISBN:9781728114361

Publisher

IEEE Press


Author Tags

  1. High-level Synthesis
  2. Neural Machine Translation
  3. Program Translator

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 312 of 1,061 submissions, 29%

