Extracting and predicting the communication behaviour of parallel applications

Authors: Rodrigo Fernandes de Mello, Evgueni Dodonov, Ricardo Bertagna, Luciano Jose Senger

International Journal of Parallel, Emergent and Distributed Systems, Volume 24, Issue 3

Pages 225 - 242

https://doi.org/10.1080/17445760802387155

Published: 01 June 2009

Abstract

This paper presents a model to extract and predict the communication behaviour of parallel applications. The behaviour is extracted by adding system calls to the Linux kernel that collect communication information about application tasks. The extracted information is organised as time series of the number of bytes transmitted and received during each task's execution. The dimension of these time series is reduced with a self-organising neural network architecture that detects common resource usage states and compacts communication events. This reduction simplifies the design of the prediction model, which no longer needs to consider a large number of distinct communication characteristics. The reduced information is submitted to a time-delay neural network that predicts the volume of future data transfers. The resulting predictions may be used by scheduling algorithms to select the best resources to allocate according to communication events. If there is no communication, processes can be distributed considering CPU capacity alone; otherwise, it is necessary to evaluate when and how many bytes are transferred in order to allocate tasks in neighbouring networks.
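To make the time-delay idea concrete, the following is a minimal sketch of how a series of per-interval byte counts could be turned into training pairs for a time-delay predictor: each window of consecutive samples is paired with the sample that follows it. This is an illustration only, not the authors' implementation. The function name `delay_embed`, the window length of 3, and the byte counts are all hypothetical, and the self-organising reduction step described in the abstract is not modelled here.

```python
from typing import List, Tuple


def delay_embed(series: List[float], delays: int) -> List[Tuple[List[float], float]]:
    """Pair each window of `delays` consecutive samples with the next sample.

    The windows play the role of delayed inputs to a time-delay predictor;
    the paired value is the target it would be trained to predict.
    """
    if delays < 1:
        raise ValueError("delays must be at least 1")
    pairs = []
    for start in range(len(series) - delays):
        window = series[start:start + delays]
        target = series[start + delays]
        pairs.append((window, target))
    return pairs


# Hypothetical bytes transmitted per sampling interval (illustrative values).
bytes_sent = [0, 512, 1024, 0, 512, 1024, 0]

# Assumed window length of 3 delayed samples per training pair.
training_pairs = delay_embed(bytes_sent, delays=3)

for window, target in training_pairs:
    print(window, "->", target)
```

A series of length n with d delays yields n - d pairs, so a longer window gives the predictor more context per example but leaves fewer examples to train on.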

Issue: June 2009, 82 pages

ISSN: 1744-5760

EISSN: 1744-5779

Publisher: Taylor & Francis, Inc., United States