Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1993498.1993502acmconferencesArticle/Chapter ViewAbstractPublication PagespldiConference Proceedingsconference-collections
research-article

Parallelism orchestration using DoPE: the degree of parallelism executive

Published: 04 June 2011 Publication History

Abstract

In writing parallel programs, programmers expose parallelism and optimize it to meet a particular performance goal on a single platform under an assumed set of workload characteristics. In the field, changing workload characteristics, new parallel platforms, and deployments with different performance goals make the programmer's development-time choices suboptimal. To address this problem, this paper presents the Degree of Parallelism Executive (DoPE), an API and run-time system that separates the concern of exposing parallelism from that of optimizing it. Using the DoPE API, the application developer expresses parallelism options. During program execution, DoPE's run-time system uses this information to dynamically optimize the parallelism options in response to the facts on the ground. We easily port several emerging parallel applications to DoPE's API and demonstrate the DoPE run-time system's effectiveness in dynamically optimizing the parallelism for a variety of performance goals.

References

[1]
R. Allen and K. Kennedy. Optimizing compilers for modern architectures: A dependence-based approach. Morgan Kaufmann Publishers Inc., 2002.
[2]
APC metered rack PDU user's guide. http://www.apc.com
[3]
C. Bienia, S. Kumar, J. P. Singh, and K. Li. The PARSEC benchmark suite: characterization and architectural implications. In Proceedings of the Seventeenth International Conference on Parallel Architecture and Compilation Techniques (PACT), 2008.
[4]
F. Blagojevic, D. S. Nikolopoulos, A. Stamatakis, C. D. Antonopoulos, and M. Curtis-Maury. Runtime scheduling of dynamic parallelism on accelerator-based multi-core systems. Parallel Computing, 2007.
[5]
R. D. Blumofe, C. F. Joerg, B. C. Kuszmaul, C. E. Leiserson, K. H. Randall, and Y. Zhou. Cilk: An efficient multithreaded runtime system. In Proceedings of the 5th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), 1995.
[6]
M. Bridges, N. Vachharajani, Y. Zhang, T. Jablin, and D. August. Revisiting the sequential programming model for multi-core. In Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2007.
[7]
C. B. Colohan, A. Ailamaki, J. G. Steffan, and T. C. Mowry. Optimistic intra-transaction parallelism on chip multiprocessors. In Proceedings of the 31st International Conference on Very Large Data Bases (VLDB), 2005.
[8]
M. Curtis-Maury, J. Dzierwa, C. D. Antonopoulos, and D. S. Nikolopoulos. Online power-performance adaptation of multithreaded programs using hardware event-based prediction. In Proceedings of the 20th International Conference on Supercomputing (ICS), 2006.
[9]
Y. Ding, M. Kandemir, P. Raghavan, and M. J. Irwin. Adapting application execution in CMPs using helper threads. Journal of Parallel and Distributed Computing, 2009.
[10]
GNU Image Manipulation Program. http://www.gimp.org
[11]
M. W. Hall and M. Martonosi. Adaptive parallelism in compiler-parallelized code. In Proceedings of the 2nd SUIF Compiler Workshop, 1997.
[12]
N. Hardavellas, I. Pandis, R. Johnson, N. Mancheril, A. Ailamaki, and B. Falsafi. Database servers on chip multiprocessors: Limitations and opportunities. In Proceedings of the Third Biennial Conference on Innovative Data Systems Research (CIDR), 2007.
[13]
W. Ko, M. N. Yankelevsky, D. S. Nikolopoulos, and C. D. Polychronopoulos. Effective cross-platform, multilevel parallelism via dynamic adaptive execution. In Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), 2002.
[14]
M. Kulkarni, K. Pingali, B. Walter, G. Ramanarayanan, K. Bala, and L. P. Chew. Optimistic parallelism requires abstractions. In Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation (PLDI), 2007.
[15]
R. Liu, K. Klues, S. Bird, S. Hofmeyr, K. Asanović, and J. Kubiatowicz. Tessellation: Space-time partitioning in a manycore client OS. In Proceedings of the First USENIX Workshop on Hot Topics in Parallelism (HotPar), 2009.
[16]
C.-K. Luk, S. Hong, and H. Kim. Qilin: Exploiting parallelism on heterogeneous multiprocessors with adaptive mapping. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2009.
[17]
Q. Lv, W. Josephson, Z. Wang, M. Charikar, and K. Li. Ferret: A toolkit for content-based similarity search of feature-rich data. ACM SIGOPS Operating Systems Review, 2006.
[18]
J. Mars, N. Vachharajani, M. L. Soffa, and R. Hundt. Contention aware execution: Online contention detection and response. In Proceedings of the Eighth Annual International Symposium on Code Generation and Optimization (CGO), 2010.
[19]
D. Meisner, B. T. Gold, and T. F. Wenisch. PowerNap: Eliminating server idle power. In Proceedings of the Fourteenth International Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2009.
[20]
A. Moreno, E. César, A. Guevara, J. Sorribes, T. Margalef, and E. Luque. Dynamic Pipeline Mapping (DPM). In Proceedings of the International Euro-Par Conference on Parallel Processing (Euro-Par), 2008.
[21]
A. Navarro, R. Asenjo, S. Tabik, and C. Cascaval. Analytical modeling of pipeline parallelism. In Proceedings of the Eighteenth International Conference on Parallel Architecture and Compilation Techniques (PACT), 2009.
[22]
The OpenMP API specification for parallel programming. http://www.openmp.org .
[23]
H. Pan, B. Hindman, and K. Asanović. Composing parallel software efficiently with Lithe. In Proceedings of the ACM SIGPLAN 2010 Conference on Programming Language Design and Implementation (PLDI), 2010.
[24]
M. K. Prabhu and K. Olukotun. Exposing speculative thread parallelism in SPEC2000. In Proceedings of the 10th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), 2005.
[25]
A. Raman, H. Kim, T. R. Mason, T. B. Jablin, and D. I. August. Speculative parallelization using software multi-threaded transactions. In Proceedings of the Fifteenth International Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2010.
[26]
J. Reinders. Intel Threading Building Blocks. O'Reilly & Associates, Inc., Sebastopol, CA, USA, 2007.
[27]
G. Semeraro, G. Magklis, R. Balasubramonian, D. H. Albonesi, S. Dwarkadas, and M. L. Scott. Energy-efficient processor design using multiple clock domains with dynamic voltage and frequency scaling. In Proceedings of the Eighth International Symposium on High-Performance Computer Architecture (HPCA), 2002.
[28]
Standard Performance Evaluation Corporation (SPEC). http://www.spec.org.
[29]
M. A. Suleman, M. K. Qureshi, Khubaib, and Y. N. Patt. Feedback-directed pipeline parallelism. In Proceedings of the Nineteenth International Conference on Parallel Architecture and Compilation Techniques (PACT), 2010.
[30]
M. A. Suleman, M. K. Qureshi, and Y. N. Patt. Feedback-driven threading: Power-efficient and high-performance execution of multi-threaded workloads on CMPs. In Proceedings of the Thirteenth International Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2008.
[31]
Sybase adaptive server. http://sybooks.sybase.com/nav/base.do.
[32]
J. Tellez and B. Dageville. Method for computing the degree of parallelism in a multi-user environment. United States Patent No. 6,820,262. Oracle International Corporation, 2004.
[33]
The IEEE and The Open Group. The Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004 Edition. 2004.
[34]
C. Tian, M. Feng, V. Nagarajan, and R. Gupta. Copy or discard execution model for speculative parallelization on multicores. In Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2008.
[35]
G. Upadhyaya, V. S. Pai, and S. P. Midkiff. Expressing and exploiting concurrency in networked applications with Aspen. In Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), 2007.
[36]
M. J. Voss and R. Eigenmann. ADAPT: Automated De-Coupled Adaptive Program Transformation. In Proceedings of the 28th International Conference on Parallel Processing (ICPP), 1999.
[37]
Z. Wang and M. F. O'Boyle. Mapping parallelism to multi-cores: A machine learning based approach. In Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), 2009.
[38]
M. Welsh, D. Culler, and E. Brewer. SEDA: An architecture for well-conditioned, scalable internet services. ACM SIGOPS Operating Systems Review, 2001.
[39]
T. Wiegand, G. J. Sullivan, G. Bjontegaard, and A. Luthra. Overview of the H.264/AVC video coding standard. IEEE Transactions on Circuits and Systems for Video Technology, 2003.
[40]
H. Zhong, M. Mehrara, S. Lieberman, and S. Mahlke. Uncovering hidden loop level parallelism in sequential applications. In Proceedings of the 14th International Symposium on High-Performance Computer Architecture (HPCA), 2008.

Cited By

View all
  • (2021)Smart resource allocation of concurrent execution of parallel applicationsConcurrency and Computation: Practice and Experience10.1002/cpe.660035:17Online publication date: 8-Sep-2021
  • (2021)Reducing the burden of parallel loop schedulers for many‐core processorsConcurrency and Computation: Practice and Experience10.1002/cpe.624133:13Online publication date: 5-Apr-2021
  • (2019)HyperqueuesACM Transactions on Parallel Computing10.1145/33656606:4(1-35)Online publication date: 19-Nov-2019
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
PLDI '11: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2011
668 pages
ISBN:9781450306638
DOI:10.1145/1993498
  • General Chair:
  • Mary Hall,
  • Program Chair:
  • David Padua
  • cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 46, Issue 6
    PLDI '11
    June 2011
    652 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/1993316
    Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 June 2011

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. dynamic
  2. loop nest
  3. loop-level
  4. nested
  5. optimization
  6. parallelism
  7. parallelization
  8. parametric
  9. pipeline
  10. run-time
  11. scheduling
  12. task

Qualifiers

  • Research-article

Conference

PLDI '11
Sponsor:

Acceptance Rates

Overall Acceptance Rate 406 of 2,067 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)14
  • Downloads (Last 6 weeks)2
Reflects downloads up to 18 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2021)Smart resource allocation of concurrent execution of parallel applicationsConcurrency and Computation: Practice and Experience10.1002/cpe.660035:17Online publication date: 8-Sep-2021
  • (2021)Reducing the burden of parallel loop schedulers for many‐core processorsConcurrency and Computation: Practice and Experience10.1002/cpe.624133:13Online publication date: 5-Apr-2021
  • (2019)HyperqueuesACM Transactions on Parallel Computing10.1145/33656606:4(1-35)Online publication date: 19-Nov-2019
  • (2019)Chunking for Dynamic Linear PipelinesACM Transactions on Architecture and Code Optimization10.1145/336381516:4(1-25)Online publication date: 18-Nov-2019
  • (2019)Adaptive Resource Views for ContainersProceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing10.1145/3307681.3325403(243-254)Online publication date: 17-Jun-2019
  • (2019)Your Containers Should be WYSIWYG2019 IEEE International Conference on Services Computing (SCC)10.1109/SCC.2019.00022(56-64)Online publication date: Jul-2019
  • (2019)BOLT: Optimizing OpenMP Parallel Regions with User-Level Threads2019 28th International Conference on Parallel Architectures and Compilation Techniques (PACT)10.1109/PACT.2019.00011(29-42)Online publication date: Sep-2019
  • (2019)The power impact of hardware and software actuators on self-adaptable many-core systemsJournal of Systems Architecture: the EUROMICRO Journal10.1016/j.sysarc.2019.05.00697:C(42-53)Online publication date: 1-Aug-2019
  • (2019)Hierarchical adaptive Multi-objective resource management for many-core systemsJournal of Systems Architecture: the EUROMICRO Journal10.1016/j.sysarc.2019.01.00697:C(416-427)Online publication date: 1-Aug-2019
  • (2019)Avoiding Scalability Collapse by Restricting ConcurrencyEuro-Par 2019: Parallel Processing10.1007/978-3-030-29400-7_26(363-376)Online publication date: 26-Aug-2019
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media