Performance and Power Evaluation of Clustered VLIW Processors with Wide Functional Units

  • Conference paper
Computer Systems: Architectures, Modeling, and Simulation (SAMOS 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3133)

Abstract

Architectural resources and program recurrences are the main limitations on the amount of Instruction-Level Parallelism (ILP) exploitable from loops. To increase the number of operations per second, current designs use high degrees of resource replication for memory ports and functional units. However, the high cost of this technique in power and cycle time limits the degree of replication.

Clustering is a technique that decentralizes the design of future wide-issue cores, enabling them to meet technology constraints on cycle time, area, and power. Another way to reduce the complexity of recent cores is to use wide functional units. This technique requires only minor modifications to the underlying hardware, but it also imposes a penalty on the exploitable parallelism.

In this paper we evaluate a broad range of VLIW configurations that make use of these two techniques. From this study we conclude that applying both techniques yields configurations with very good power-performance efficiency.
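
The abstract's opening claim, that resources and recurrences bound the ILP of a loop, can be illustrated with the standard lower bound on the initiation interval used in modulo scheduling. The sketch below is not taken from the paper: the function names, the `Recurrence` type, and the example loop (8 floating-point operations, 4 memory operations, one recurrence of latency 6) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Recurrence:
    """A dependence cycle in the loop body (hypothetical helper type)."""
    latency: int   # total latency of the operations along the cycle
    distance: int  # number of iterations the cycle spans


def res_mii(op_counts, unit_counts):
    """Resource bound: the most heavily used resource class sets the pace."""
    return max(
        (ceil(op_counts[kind] / unit_counts[kind]) for kind in op_counts),
        default=0,
    )


def rec_mii(recurrences):
    """Recurrence bound: the slowest dependence cycle sets the pace."""
    return max((ceil(r.latency / r.distance) for r in recurrences), default=0)


def mii(op_counts, unit_counts, recurrences):
    """Minimum initiation interval: at least 1 cycle, else the larger bound."""
    return max(1, res_mii(op_counts, unit_counts), rec_mii(recurrences))


# Illustrative loop body: 8 FP operations and 4 memory operations.
ops = {"fp": 8, "mem": 4}
base_units = {"fp": 2, "mem": 1}
doubled_units = {"fp": 4, "mem": 2}
cycle = [Recurrence(latency=6, distance=1)]

print(mii(ops, base_units, []))        # resource-bound loop
print(mii(ops, doubled_units, []))     # replication halves the bound
print(mii(ops, base_units, cycle))     # recurrence-bound loop
print(mii(ops, doubled_units, cycle))  # replication no longer helps
```

In this toy case, doubling the functional units and memory ports halves the bound only while the loop is resource-limited; once a recurrence dominates, extra hardware adds cost in power and cycle time without shortening the interval.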

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pericàs, M., Ayguadé, E., Zalamea, J., Llosa, J., Valero, M. (2004). Performance and Power Evaluation of Clustered VLIW Processors with Wide Functional Units. In: Pimentel, A.D., Vassiliadis, S. (eds) Computer Systems: Architectures, Modeling, and Simulation. SAMOS 2004. Lecture Notes in Computer Science, vol 3133. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27776-7_10

  • DOI: https://doi.org/10.1007/978-3-540-27776-7_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22377-1

  • Online ISBN: 978-3-540-27776-7

  • eBook Packages: Springer Book Archive