Simulation of bevel gear cutting with GPGPUs--performance and productivity

Authors:

Dmytro Plotnikov,

Christian Bischof,

Ario Hardjosuwito,

Christof Gorgels,

Christian Brecher

Computer Science - Research and Development, Volume 26, Issue 3-4

Pages 165 - 174

https://doi.org/10.1007/s00450-011-0158-0

Published: 01 June 2011

Abstract

The demand for general-purpose computation on graphics processing units has driven the development of new programming paradigms, e.g. OpenCL C/C++, CUDA C, and the PGI Accelerator Model. In this paper, we apply these programming approaches to KegelSpan, a software package for simulating bevel gear cutting. This engineering application simulates an important manufacturing process in the automotive industry. The results are compared to an OpenMP implementation on various hardware configurations. The discussion covers not only performance results but also the productivity of code development experienced in this effort.

Index Terms
1. General and reference
  1. Cross-computing tools and techniques

Information

Published In

Computer Science - Research and Development Volume 26, Issue 3-4

June 2011

186 pages

ISSN:1865-2034

Copyright © 2011 Springer-Verlag.

Publisher

Springer-Verlag

Berlin, Heidelberg

Qualifiers

Article

Cited By

Wienke S, Miller J, Schulz M, Müller M, West J (2016) Development effort estimation in HPC. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp 1-12. Online publication date: 13-Nov-2016.
https://dl.acm.org/doi/10.5555/3014904.3014918

Arroyo M, Couder-Castañeda C, Trujillo-Alcantara A, Herrera-Diaz I, Vera-Chavez N (2015) A performance study of a dual Xeon-Phi cluster for the forward modelling of gravitational fields. Scientific Programming 2015:15-15. Online publication date: 1-Jan-2015.
https://dl.acm.org/doi/10.1155/2015/316012

Schmidl D, Cramer T, Wienke S, Terboven C, Müller M (2013) Assessing the performance of OpenMP programs on the intel xeon phi. Proceedings of the 19th international conference on Parallel Processing, pp 547-558. Online publication date: 26-Aug-2013.
https://dl.acm.org/doi/10.1007/978-3-642-40047-6_56

Wienke S, Springer P, Terboven C, an Mey D (2012) OpenACC. Proceedings of the 18th international conference on Parallel Processing, pp 859-870. Online publication date: 27-Aug-2012.
https://dl.acm.org/doi/10.1007/978-3-642-32820-6_85
