DOI: 10.1145/502217.502241

Tailoring pipeline bypassing and functional unit mapping to application in clustered VLIW architectures

Published: 16 November 2001

Abstract

In this paper we describe a design exploration methodology for clustered VLIW architectures. The central idea of this work is a set of three techniques aimed at reducing the cost of inter-cluster copy operations. First, instruction scheduling is performed using a list-scheduling algorithm that stores operand chains in the same register file. Second, functional units are assigned to clusters based on the application's inter-cluster communication pattern. Finally, a careful insertion of pipeline bypasses is used to increase the number of data dependencies that can be satisfied by pipeline register operands. Experimental results, using the SPEC95 benchmarks and the IMPACT compiler, reveal a substantial reduction in the number of copies between clusters.
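
To make the cost being targeted concrete, the following is a minimal sketch of how inter-cluster copies might be counted for a small data-dependence graph, together with a simple greedy placement that tries to keep operand chains in one cluster. It is not the paper's list-scheduling algorithm: the function names, the per-cluster `capacity` limit, and the example graph are all assumptions made for illustration only.

```python
from collections import defaultdict


def count_intercluster_copies(edges, cluster_of):
    """Count values that must be copied into a remote cluster.

    edges: iterable of (producer, consumer) data dependences.
    cluster_of: dict mapping each operation to its cluster index.
    A value used several times in the same remote cluster is counted once.
    """
    needed = set()
    for producer, consumer in edges:
        dst = cluster_of[consumer]
        if cluster_of[producer] != dst:
            needed.add((producer, dst))
    return len(needed)


def assign_following_chains(ops, edges, num_clusters, capacity):
    """Greedy placement (hypothetical heuristic, not the authors' method).

    ops must be listed in topological order. Each operation goes to the
    cluster that already holds most of its operands, provided that cluster
    still has room; ties go to the least-loaded, lowest-numbered cluster.
    Assumes num_clusters * capacity >= len(ops).
    """
    preds = defaultdict(list)
    for producer, consumer in edges:
        preds[consumer].append(producer)

    load = [0] * num_clusters
    cluster_of = {}
    for op in ops:
        votes = [0] * num_clusters
        for p in preds[op]:
            votes[cluster_of[p]] += 1
        candidates = [c for c in range(num_clusters) if load[c] < capacity]
        best = max(candidates, key=lambda c: (votes[c], -load[c], -c))
        cluster_of[op] = best
        load[best] += 1
    return cluster_of


# Two dependence chains (a -> b -> c and d -> e -> f) joined at f.
ops = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("e", "f"), ("c", "f")]

chained = assign_following_chains(ops, edges, num_clusters=2, capacity=3)
round_robin = {op: i % 2 for i, op in enumerate(ops)}

print("chain-following copies:", count_intercluster_copies(edges, chained))
print("round-robin copies:", count_intercluster_copies(edges, round_robin))
```

On this toy graph, keeping each chain in one cluster leaves a single cross-cluster value, whereas alternating operations between clusters forces a copy on every dependence. A real scheduler would also have to account for issue slots, latencies, and register pressure, which this sketch ignores.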

Published In

CASES '01: Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems
November 2001
258 pages
ISBN: 1581133995
DOI: 10.1145/502217
Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

Overall Acceptance Rate 52 of 230 submissions, 23%

Cited By

  • (2007) Automatic Design Space Exploration of Register Bypasses in Embedded Processors. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 26:12 (2102-2115). DOI: 10.1109/TCAD.2007.907066. Online publication date: 1-Dec-2007
  • (2006) Retargetable pipeline hazard detection for partially bypassed processors. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 14:8 (791-801). DOI: 10.1109/TVLSI.2006.878468. Online publication date: 1-Aug-2006
  • (2005) PBExplore. Proceedings of the conference on Design, Automation and Test in Europe - Volume 2 (1264-1269). DOI: 10.1109/DATE.2005.236. Online publication date: 7-Mar-2005
  • (2004) FLASH. Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization. DOI: 10.5555/977395.977671. Online publication date: 20-Mar-2004
  • (2004) Operation tables for scheduling in the presence of incomplete bypassing. Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis (194-199). DOI: 10.1145/1016720.1016768. Online publication date: 8-Sep-2004
  • (2004) FLASH: foresighted latency-aware scheduling heuristic for processors with customized datapaths. International Symposium on Code Generation and Optimization, 2004. CGO 2004. (201-212). DOI: 10.1109/CGO.2004.1281675. Online publication date: 2004
  • (2003) Systematic register bypass customization for application-specific processors. Proceedings IEEE International Conference on Application-Specific Systems, Architectures, and Processors. ASAP 2003 (64-74). DOI: 10.1109/ASAP.2003.1212830. Online publication date: 2003
