Abstract
This paper presents a method to derive efficient algorithms for hypercubes. The method exploits two features of the underlying hardware: a) the parallelism provided by the multiple communication links of each node and b) the possibility of overlapping computations and communications, which is a feature of machines supporting an asynchronous communication protocol. The method can be applied to a generic class of hypercube algorithms. Many examples of this class of algorithms are found in the literature for different problems. The paper shows the efficiency of the method using two of these problems as an example: FFT and Vector Add. The results show that the reduction in communication overhead is very significant in many cases and the algorithms produced by our method are always very close to the optimum in terms of execution time.
This work has been supported by the Ministry of Education and Science of Spain (CICYT TIC-92/880 and TIC-91/1036) and the European Center for Parallelism in Barcelona (CEPBA).
Chapter PDF
Similar content being viewed by others
References
Agarwal, R. C., Gustavson, F. G., Zubair, M.: An Efficient Algorithm for the 3-D FFT NAS Parallel Benchmark. Scalable High-Performance Computing Conf. (1994) 129–133
Aykanat, C., Dervis, A.: An Overlapped FFT Algorithm for Hypercube Multicomputer. ICPP (1991) III-316–III-317
Aykanat, C., Dervis, A.: Efficient Fast Hartley Transform Algorithms for Hypercube — Connected Multicomputers. IEEE Transactions on Parallel and Distributed Systems, vol. 6, no. 6(1995) 561–577
Clement, M. J., Quinn, M. J.: Overlapping Computations, Communications and I/O in Parallel Sorting. Journal of Parallel and Distributed Computing 28 (1995) 162–172
Díaz de Cerio, L., González, A., Valero-García, M.: Communication Pipelining in Hypercubes (submitted for publishing)
Díaz de Cerio, L., Valero-García, M., González, A.: Overlapping Communication and Computation in Hypercubes. DAC/UPC Research Report No. RR-96/02 (1996)
Fox, G. et al.: Solving Problems on Concurrent Processors. Englewood Cliffs, N. J. Prentice-Hall (1988)
Johnsson, S. L., Ho, C. T.: Optimum broadcasting and Personalized Communication in Hypercubes. IEEE Trans. Comput. 38 (1989) 1249–1268
Johnsson, S. L., Krawitz, R. L.: Cooley-Tukey FFT on the Connection Machine. Parallel Computing 18 (1992) 1201–1221
Lam, M.: Software Pipelining: An Effective Scheduling Technique for VLIW machines. Conf. on Programming Language Design and Implementation (1988) 318–328
Mantharam, M., Eberlein, P. J.: Block Recursive Algorithm to Generate Jacobi-sets. Parallel Computing 19 (1993) 481–496
Sahay, A.: Hiding Communication Costs in Bandwidth-Limited Parallel FFT Computation Report: UCB/CSD 93/722, University of California (1993)
Suarez A., Ojeda-Guerra, C.: Overlapping Computations and Communications in Tours Networks. 4th Euromicro Workshop on Parallel and Distributed Processing (1996) 163–169
Thomson Leighton, F.: Introduction to Parallel Algorithms and Architectures: Arrays, Trees and Hypercubes. Morgan Kaufmann Publishers (1992)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Díaz de Cerio, L., Valero-García, M., González, A. (1996). Overlapping communication and computation in hypercubes. In: Bougé, L., Fraigniaud, P., Mignotte, A., Robert, Y. (eds) Euro-Par'96 Parallel Processing. Euro-Par 1996. Lecture Notes in Computer Science, vol 1123. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61626-8_33
Download citation
DOI: https://doi.org/10.1007/3-540-61626-8_33
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61626-9
Online ISBN: 978-3-540-70633-5
eBook Packages: Springer Book Archive