Abstract
In this paper we study a parallel form of the SOR method, suitable for GPUs using CUDA, for the numerical solution of the convection-diffusion equation. To exploit the parallelism offered by GPUs we adopt the fine-grain parallelism model, which we obtain from the local relaxation version of SOR. More specifically, we use SOR with red-black ordering and two sets of parameters, \(\omega_{ij}\) and \(\omega'_{ij}\). The parameter \(\omega_{ij}\) is associated with each red (i+j even) grid point (i, j), whereas the parameter \(\omega'_{ij}\) is associated with each black (i+j odd) grid point (i, j). Using one parameter per grid point avoids the global communication required to determine the best value of ω adaptively, and it also increases the convergence rate of the SOR method [3]. We present our strategy and the results of our effort to exploit the computational capabilities of GPUs under the CUDA environment. Additionally, a program for the CPU was developed as a performance reference. All three GPU kernel variations that we developed achieved a significant performance improvement, and each proved to have different pros and cons.
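The red-black local relaxation sweep described in the abstract can be illustrated with a minimal CPU sketch. It is not the authors' code and not one of their CUDA kernels. It assumes the model problem \(u_{xx} + u_{yy} - f u_x - g u_y = 0\) with constant coefficients `f` and `g`, central differences on a uniform square grid with spacing `h`, and Dirichlet boundary values stored in the outer rows and columns of `u`. The function names and the arrays `omega_red` and `omega_black` are hypothetical, and the per-point values passed in are placeholders: the formulas for \(\omega_{ij}\) and \(\omega'_{ij}\) from [3] are not reproduced here.

```python
def stencil_weights(f, g, h):
    """Central-difference weights for u_xx + u_yy - f*u_x - g*u_y = 0 (assumed form)."""
    return (0.25 * (1 + 0.5 * h * f),   # west neighbour,  u[i-1][j]
            0.25 * (1 - 0.5 * h * f),   # east neighbour,  u[i+1][j]
            0.25 * (1 + 0.5 * h * g),   # south neighbour, u[i][j-1]
            0.25 * (1 - 0.5 * h * g))   # north neighbour, u[i][j+1]


def neighbour_average(u, i, j, w):
    """Weighted sum of the four neighbours of interior point (i, j)."""
    west, east, south, north = w
    return (west * u[i - 1][j] + east * u[i + 1][j]
            + south * u[i][j - 1] + north * u[i][j + 1])


def red_black_local_sor(u, omega_red, omega_black, f, g, h, sweeps):
    """Run `sweeps` red/black SOR iterations in place on the interior of u.

    omega_red[i][j] is used at red points (i+j even) and omega_black[i][j]
    at black points (i+j odd); both are caller-supplied placeholders.
    """
    n = len(u)
    w = stencil_weights(f, g, h)
    for _ in range(sweeps):
        # Red pass first, then black pass. Within one colour every update
        # reads only points of the other colour, so the updates are
        # independent; on a GPU each could be handled by its own thread.
        for parity, omega in ((0, omega_red), (1, omega_black)):
            for i in range(1, n - 1):
                # First interior column j whose (i + j) has the wanted parity.
                j_start = 1 if (i + 1) % 2 == parity else 2
                for j in range(j_start, n - 1, 2):
                    target = neighbour_average(u, i, j, w)
                    u[i][j] += omega[i][j] * (target - u[i][j])
    return u


def max_residual(u, f, g, h):
    """Largest absolute residual of the discrete equation over the interior."""
    n = len(u)
    w = stencil_weights(f, g, h)
    return max(abs(neighbour_average(u, i, j, w) - u[i][j])
               for i in range(1, n - 1) for j in range(1, n - 1))
```

To try it, fill the boundary of an `n` by `n` list of lists with the Dirichlet data, zero the interior, pass two `n` by `n` arrays of relaxation factors between 0 and 2, and watch `max_residual` shrink as `sweeps` grows. Storing the parameters per grid point means no global reduction is needed between sweeps, which is the property the abstract highlights.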
References
Adams, L.M., Leveque, R.J., Young, D.: Analysis of the SOR iteration for the 9-point Laplacian. SIAM J. Num. Anal. 9, 1156–1180 (1988)
Botta, E.F., Veldman, A.E.P.: On local relaxation methods and their application to convection-diffusion equations. J. Comput. Phys. 48, 127–149 (1981)
Boukas, L.A., Missirlis, N.M.: The Parallel Local Modified SOR for Nonsymmetric Linear Systems. Intern. J. Computer Math. 68, 153–174 (1998)
Ehrlich, L.W.: An Ad-Hoc SOR Method. J. Comput. Phys. 42, 31–45 (1981)
Ehrlich, L.W.: The Ad-Hoc SOR Method: A Local Relaxation Scheme. In: Elliptic Problem Solvers II, pp. 257–269. Academic Press, New York (1984)
Ha, L., Krüger, J., Joshi, S., Silva, C.T.: Multiscale Unbiased Diffeomorphic Atlas Construction on Multi-GPUs. GPU Computing Gems. Emerald Edition, pp. 771–791. Morgan Kaufmann (2011)
Hageman, L.A., Young, D.M.: Applied Iterative Methods. Academic Press, New York (1981)
Kirk, D.B., Hwu, W.W.: Programming Massively Parallel Processors. Morgan Kaufmann (2009)
Komatsu, K., Soga, T., Egawa, R., Takizawa, H., Kobayashi, H., Takahashi, S., Sasaki, D., Nakahashi, K.: Parallel Processing of the Building-Cube Method on the GPU Platform. In: Computers & Fluids Special Issue “22nd International Conference on Parallel Computational Fluid Dynamics”, vol. 45(1), pp. 122–128 (2011)
Konstantinidis, E., Cotronis, Y.: Accelerating the Red/Black SOR Method Using GPUs with CUDA. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds.) PPAM 2011, Part I. LNCS, vol. 7203, pp. 589–598. Springer, Heidelberg (2012)
Kuo, C.-C.J., Levy, B.C., Musicus, B.R.: A local relaxation method for solving elliptic PDE’s on mesh-connected arrays. SIAM J. Sci. Statist. Comput. 8, 530–573 (1987)
Nickolls, J., Buck, I., Garland, M., Skadron, K.: Scalable Parallel Programming with CUDA. In: ACM SIGGRAPH 2008 Classes, vol. 16, pp. 1–14 (2008)
Ortega, J.M., Voight, R.G.: Solution of Partial Differential Equations on Vector and Parallel Computers. SIAM, Philadelphia (1985)
Varga, R.S.: Matrix Iterative Analysis. Prentice-Hall, Englewood Cliffs (1962)
Young, D.M.: Iterative Solution of Large Linear Systems. Academic Press, New York (1971)
NVidia CUDA Reference Manual v. 4.0, NVidia (2011)
NVidia CUDA C Best Practices Guide Version 4.0, NVidia (2011)
Tuning CUDA Applications for Fermi, NVidia (2011)
Tesla C2050 And Tesla C2070 Computing Processor Board, NVidia (2011)
The OpenCL Specification, Khronos group (2009)
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cotronis, Y., Konstantinidis, E., Louka, M.A., Missirlis, N.M. (2012). Parallel SOR for Solving the Convection Diffusion Equation Using GPUs with CUDA. In: Kaklamanis, C., Papatheodorou, T., Spirakis, P.G. (eds) Euro-Par 2012 Parallel Processing. Euro-Par 2012. Lecture Notes in Computer Science, vol 7484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32820-6_57
DOI: https://doi.org/10.1007/978-3-642-32820-6_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32819-0
Online ISBN: 978-3-642-32820-6
eBook Packages: Computer Science (R0)