Exploiting the Cell/BE Architecture with the StarPU Unified Runtime System

  • Conference paper
Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5657)

Abstract

Core specialization is currently one of the most promising ways to design power-efficient multicore chips. However, approaching the theoretical peak performance of such heterogeneous multicore architectures with specialized accelerators is a complex issue. While substantial effort has been devoted to efficiently offloading parts of the computation, designing an execution model that unifies all computing units is the main challenge.

We therefore designed the StarPU runtime system to provide portable support for heterogeneous multicore processors to high-performance applications and compiler environments. StarPU provides a high-level, unified execution model that is tightly coupled to an expressive data management library. In addition to our previous results on using multicore processors alongside graphics processors, we show that StarPU is flexible enough to efficiently exploit the heterogeneous resources in the Cell processor. We present a scalable design that supports multiple different accelerators while minimizing the overhead on the overall system. Using experiments with classical linear algebra algorithms, we show that StarPU improves programmability and provides performance portability.
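
To make the idea of a unified execution model concrete, the following is a minimal illustrative sketch, not StarPU's actual API. It assumes a task is described once and carries one implementation per kind of computing unit, and that a scheduler picks a unit at submission time. The names `Codelet`, `Worker`, `submit`, and the architecture labels `"cpu"` and `"spu"` are hypothetical, and the least-loaded policy is a stand-in for whatever scheduling strategy the runtime really uses.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Kernel = Callable[[List[float]], List[float]]


@dataclass
class Codelet:
    """One logical task with one implementation per kind of computing unit."""
    name: str
    impls: Dict[str, Kernel]  # keyed by a hypothetical architecture label


@dataclass
class Worker:
    """A computing unit; `arch` is a hypothetical label such as "cpu" or "spu"."""
    arch: str
    executed: List[str] = field(default_factory=list)


def submit(codelet: Codelet, data: List[float], workers: List[Worker]) -> List[float]:
    """Run the codelet on the least-loaded worker that has an implementation for it."""
    candidates = [w for w in workers if w.arch in codelet.impls]
    if not candidates:
        raise ValueError(f"no worker can run {codelet.name}")
    worker = min(candidates, key=lambda w: len(w.executed))  # ties go to the first worker
    worker.executed.append(codelet.name)
    return codelet.impls[worker.arch](data)


def scale_cpu(v: List[float]) -> List[float]:
    return [2.0 * x for x in v]


def scale_spu(v: List[float]) -> List[float]:
    # Stand-in for an accelerator kernel; it computes the same result differently.
    return [x + x for x in v]


scale = Codelet("scale", {"cpu": scale_cpu, "spu": scale_spu})
workers = [Worker("cpu"), Worker("spu")]
results = [submit(scale, [1.0, 2.0], workers) for _ in range(4)]
```

The application code only calls `submit`; which unit runs the task is decided by the scheduler, which is the property the abstract refers to as performance portability. Data movement between memories, which the abstract attributes to the data management library, is deliberately left out of this sketch.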

© 2009 IFIP International Federation for Information Processing

About this paper

Cite this paper

Augonnet, C., Thibault, S., Namyst, R., Nijhuis, M. (2009). Exploiting the Cell/BE Architecture with the StarPU Unified Runtime System. In: Bertels, K., Dimopoulos, N., Silvano, C., Wong, S. (eds) Embedded Computer Systems: Architectures, Modeling, and Simulation. SAMOS 2009. Lecture Notes in Computer Science, vol 5657. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03138-0_36

  • DOI: https://doi.org/10.1007/978-3-642-03138-0_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03137-3

  • Online ISBN: 978-3-642-03138-0

  • eBook Packages: Computer Science, Computer Science (R0)
