Characterizing the challenges and evaluating the efficacy of a CUDA-to-OpenCL translator

Published: 01 December 2013

Abstract

The proliferation of heterogeneous computing systems has led to increased interest in parallel architectures and their associated programming models. One of the most promising models for heterogeneous computing is the accelerator model, and one of the most cost-effective, high-performance accelerators currently available is the general-purpose graphics processing unit (GPU). Two similar programming environments have been proposed for GPUs: CUDA and OpenCL. While more lines of code have already been written in CUDA, OpenCL is an open standard that supports a broader range of devices. Hence, there is significant interest in automatic translation from CUDA to OpenCL. The contributions of this work are three-fold: (1) an extensive characterization of the subtle challenges of translation, (2) CU2CL (CUDA to OpenCL), an implementation of a translator, and (3) an evaluation of CU2CL with respect to coverage of CUDA, translation performance, and performance of the translated applications.
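
To give a feel for what translating CUDA to OpenCL involves, the sketch below applies a few well-known kernel-side correspondences (for example, `__global__` to `__kernel`, and `threadIdx.x` to `get_local_id(0)`) by plain text substitution. It is an illustrative assumption only, not CU2CL's method or rule set; the names `KERNEL_REWRITES` and `translate_kernel` are hypothetical.

```python
import re

# Hand-picked CUDA -> OpenCL kernel-side correspondences (1-D case only).
# Illustrative assumption: this is NOT CU2CL's actual rule set.
KERNEL_REWRITES = [
    (r"\b__global__\b", "__kernel"),
    (r"\b__shared__\b", "__local"),
    (r"\b__syncthreads\(\)", "barrier(CLK_LOCAL_MEM_FENCE)"),
    (r"\bthreadIdx\.x\b", "get_local_id(0)"),
    (r"\bblockIdx\.x\b", "get_group_id(0)"),
    (r"\bblockDim\.x\b", "get_local_size(0)"),
    (r"\bgridDim\.x\b", "get_num_groups(0)"),
]


def translate_kernel(cuda_src: str) -> str:
    """Apply each textual rewrite in order and return the rewritten source."""
    out = cuda_src
    for pattern, replacement in KERNEL_REWRITES:
        out = re.sub(pattern, replacement, out)
    return out


# Hypothetical CUDA kernel used as input.
cuda_kernel = (
    "__global__ void scale(float *v, float s) { "
    "int i = blockIdx.x * blockDim.x + threadIdx.x; v[i] *= s; }"
)
opencl_kernel = translate_kernel(cuda_kernel)
print(opencl_kernel)
```

The result is still not valid OpenCL: the pointer parameter `float *v` lacks the `__global` address-space qualifier that OpenCL kernel arguments require, and none of the host-side API calls are handled. Gaps of this kind are one reason a purely textual mapping falls short of a real translator.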

Published In

Parallel Computing, Volume 39, Issue 12
December 2013
140 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Author Tags

  1. CUDA
  2. GPU
  3. OpenCL
  4. Source-to-source translation

Qualifiers

  • Article
