Nothing Special   »   [go: up one dir, main page]

skip to main content
Partitioning and scheduling parallel programs for execution on multiprocessors
Publisher:
  • Stanford University
  • 408 Panama Mall, Suite 217
  • Stanford
  • CA
  • United States
Order Number:UMI Order No. GAX87-23080
Reflects downloads up to 13 Nov 2024Bibliometrics
Skip Abstract Section
Abstract

There are three fundamental problems to be solved in the execution of a parallel program on a multiprocessor--identifying the parallelism in the program, partitioning the program into tasks and scheduling the tasks on processors. Whereas the problem of identifying parallelism is a programming language issue, the partitioning and scheduling problems are intimately related to parameters of the target multiprocessor, like the number of processors and synchronisation and communication overhead. It is desirable for the partitioning and scheduling to be performed automatically, so that the same parallel program can execute efficiently on different multiprocessors. This dissertation presents two solutions to the partitioning and scheduling problems. The first approach is based on a macro-dataflow model, where the program is partitioned into tasks at compile-time and the tasks are scheduled on processors at run-time. The second approach is based on a compile-time scheduling model, where the partitioning of the program and the scheduling of tasks on processors are both performed at compile-time.Both approaches have been implemented to partition programs written in the single-assignment language, SISAL. The inputs to the partitioning and scheduling algorithms are a graphical representation of the program and a list of parameters describing the target multiprocessor. Execution profile information is used to derive compile-time estimates of execution times and data sizes in the program. Both the macro-dataflow and compile-time scheduling problems are expressed as optimisation problems, which are proved to be NP-complete in the strong sense. This dissertation presents efficient approximation algorithms for these problems. The effectiveness of the partitioning and scheduling algorithms is studied by multiprocessor simulations of various SISAL benchmark programs for different target multiprocessor parameters.

Cited By

  1. Renaud O, Miomandre H, Desnos K and Nezan J (2024). Automated level-based clustering of dataflow actors for controlled scheduling complexity, Journal of Systems Architecture: the EUROMICRO Journal, 154:C, Online publication date: 1-Sep-2024.
  2. ACM
    Ekaireb T, Brand L, Avaraddy N, Mock M, Krintz C and Wolski R Laminar: Dataflow Programming for Serverless IoT Applications Proceedings of the 1st Workshop on SErverless Systems, Applications and MEthodologies, (5-11)
  3. Maurya A and Tripathi A (2019). ECP, Computing, 101:8, (1015-1039), Online publication date: 1-Aug-2019.
  4. Qiu M, Gai K and Xiong Z (2018). Privacy-preserving wireless communications using bipartite matching in social big data, Future Generation Computer Systems, 87:C, (772-781), Online publication date: 1-Oct-2018.
  5. ACM
    Wu C, Wicenec A and Tobar R Partitioning SKA Dataflows for Optimal Graph Execution Proceedings of the 9th Workshop on Scientific Cloud Computing, (1-6)
  6. ACM
    Malik A and Gregg D (2015). Heuristics on Reachability Trees for Bicriteria Scheduling of Stream Graphs on Heterogeneous Multiprocessor Architectures, ACM Transactions on Embedded Computing Systems, 14:2, (1-26), Online publication date: 25-Mar-2015.
  7. ACM
    Malik A and Gregg D (2013). Orchestrating stream graphs using model checking, ACM Transactions on Architecture and Code Optimization, 10:3, (1-25), Online publication date: 16-Sep-2013.
  8. ACM
    Bathen L, Ahn Y, Pasricha S and Dutt N (2013). MultiMaKe, ACM Transactions on Embedded Computing Systems, 12:1s, (1-25), Online publication date: 1-Mar-2013.
  9. ACM
    Rădulescu A and van Gemund A On the complexity of list scheduling algorithms for distributed-memory systems Proceedings of the 13th international conference on Supercomputing, (68-75)
  10. Hunt G and Scott M The Coign automatic distributed partitioning system Proceedings of the third symposium on Operating systems design and implementation, (187-200)
  11. ACM
    Tang X and Gao G How “hard” is thread partitioning and how “bad” is a list scheduling based partitioning algorithm? Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures, (130-139)
  12. Ayed M and Gaudiot J (2019). Analysis of a Heuristic for Code Partitioning, The Journal of Supercomputing, 12:3, (191-226), Online publication date: 1-May-1998.
  13. ACM
    Prasanna G (1997). Compilation of parallel multimedia computations—extending retiming theory and Amdahl's law, ACM SIGPLAN Notices, 32:7, (180-192), Online publication date: 1-Jul-1997.
  14. ACM
    Prasanna G Compilation of parallel multimedia computations—extending retiming theory and Amdahl's law Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming, (180-192)
  15. ACM
    Tang X, Wang J, Theobald K and Gao G Thread partitioning and scheduling based on cost model Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures, (272-281)
  16. Prasanna G and Musicus B (1996). Generalized Multiprocessor Scheduling and Applications to Matrix Computations, IEEE Transactions on Parallel and Distributed Systems, 7:6, (650-664), Online publication date: 1-Jun-1996.
  17. Ge Y and Yun D Simultaneous Compression of Makespan and Number of Processors Using CRP Proceedings of the 10th International Parallel Processing Symposium, (332-338)
  18. Roh L, Najjar W and Böhm A (2019). Generation, Optimization, and Evaluation of Multithreaded Code, Journal of Parallel and Distributed Computing, 32:2, (188-204), Online publication date: 1-Feb-1996.
  19. Prasanna G and Musicus B Generalized multiprocessor scheduling for directed acylic graphs Proceedings of the 1994 ACM/IEEE conference on Supercomputing, (237-246)
  20. Roh L, Najjar W, Shankar B and Böhm A An Evaluation of Optimized Threaded Code Generation Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques, (37-46)
  21. Pande S and Psarris K Compiling Functional Parllelism on a Family of Distributed Memory Architectures Proceedings of the 1994 International Conference on Parallel Processing - Volume 01, (182-186)
  22. Srinivasa Prasanna G, Agrawal A and Musicus B (2019). Hierarchical Compilation of Macro Dataflow Graphs for Multiprocessors with Local Memory, IEEE Transactions on Parallel and Distributed Systems, 5:7, (720-736), Online publication date: 1-Jul-1994.
  23. ACM
    Halstead B, Callahan D, Dennis J, Nikhil R and Sarkar V (1994). Programming, compilation, and resource management issues for multithreading (panel session II), ACM SIGARCH Computer Architecture News, 22:1, (19-33), Online publication date: 1-Mar-1994.
  24. Sohn A A Parallel Implementation of the Traveling Salesman Problem on a Sequent Symmetry Mulitprocessor Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism, (273-280)
  25. Najjar W, Roh L and Böhm A The Initial Performance of a Bottom-Up Clustering Algorithm for Dataflow Graphs Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism, (91-100)
  26. Watts T, Soffa M and Gupta R Techniques for integrating parallelizing transformations and compiler-based scheduling methods Proceedings of the 1992 ACM/IEEE conference on Supercomputing, (830-839)
  27. Sussman A Model-driven mapping onto distributed memory parallel computers Proceedings of the 1992 ACM/IEEE conference on Supercomputing, (818-829)
  28. ACM
    Chatterjee S, Blelloch G and Fisher A (2019). Size and access inference for data-parallel programs, ACM SIGPLAN Notices, 26:6, (130-144), Online publication date: 1-Jun-1991.
  29. ACM
    Sarkar V and Gao G Optimization of array accesses by collective loop transformations Proceedings of the 5th international conference on Supercomputing, (194-205)
  30. ACM
    Chatterjee S, Blelloch G and Fisher A Size and access inference for data-parallel programs Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation, (130-144)
  31. ACM
    Sarkar V and Cann D (2019). POSC—a partitioning and optimizing SISAL compiler, ACM SIGARCH Computer Architecture News, 18:3b, (148-164), Online publication date: 1-Sep-1990.
  32. ACM
    Sarkar V and Cann D POSC—a partitioning and optimizing SISAL compiler Proceedings of the 4th international conference on Supercomputing, (148-164)
  33. ACM
    Sarkar V (2019). Determining average program execution times and their variance, ACM SIGPLAN Notices, 24:7, (298-312), Online publication date: 1-Jul-1989.
  34. ACM
    Sarkar V Determining average program execution times and their variance Proceedings of the ACM SIGPLAN 1989 conference on Programming language design and implementation, (298-312)
  35. ACM
    Burke M, Cytron R, Ferrante J, Hsieh W, Sarkar V and Shields D (1988). Automatic discovery of parallelism: a tool and an experiment (extended abstract), ACM SIGPLAN Notices, 23:9, (77-84), Online publication date: 1-Sep-1988.
  36. ACM
    Allen F, Burke M, Cytron R, Ferrante J and Hsieh W A framework for determining useful parallelism Proceedings of the 2nd international conference on Supercomputing, (207-215)
  37. ACM
    Burke M, Cytron R, Ferrante J, Hsieh W, Sarkar V and Shields D Automatic discovery of parallelism: a tool and an experiment (extended abstract) Proceedings of the ACM/SIGPLAN conference on Parallel programming: experience with applications, languages and systems, (77-84)
Contributors
  • Georgia Institute of Technology
Please enable JavaScript to view thecomments powered by Disqus.

Recommendations