Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture

December 2003

2003 Proceeding

Publisher:

IEEE Computer Society
1730 Massachusetts Ave., NW Washington, DC
United States

Conference:

MICRO-36: The 36th Annual International Symposium on Microarchitecture, December 3 - 5, 2003

ISBN:

978-0-7695-2043-8

Published:

03 December 2003

Sponsors:

SIGMICRO

Bibliometrics (reflects downloads up to 22 Nov 2024)

Citation Count

2,181

Downloads (6 weeks)

Downloads (12 months)

176

Downloads (cumulative)

30,221

Article

Message from the General Chair

Page .09

Metrics
- Total Citations: 0
- Total Downloads: 184
- Last 12 Months: 1
- Last 6 weeks: 0

Article

Message from the Program Chair

Page .10

Metrics
- Total Citations: 0
- Total Downloads: 168
- Last 12 Months: 3
- Last 6 weeks: 1

Article

Committees

Page .11

Metrics
- Total Citations: 0
- Total Downloads: 151
- Last 12 Months: 0
- Last 6 weeks: 0

Article

Reviewers

Page .13

Metrics
- Total Citations: 0
- Total Downloads: 176
- Last 12 Months: 1
- Last 6 weeks: 0

Article

Microarchitecture on the MOSFET Diet

Kerry Bernstein

Page 3

Metrics
- Total Citations: 0
- Total Downloads: 242
- Last 12 Months: 0
- Last 6 weeks: 0

Article

Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation

Dan Ernst,
Nam Sung Kim,
Shidhartha Das,
Sanjay Pant,
Rajeev Rao,
Toan Pham,
Conrad Ziesler,
David Blaauw,
Todd Austin,
Krisztian Flautner,
Trevor Mudge

Page 7

With increasing clock frequencies and silicon integration, power aware computing has become a critical concern in the design of embedded processors and systems-on-chip. One of the more effective and widely used methods for power-aware computing is dynamic ...

Metrics
- Total Citations: 283
- Total Downloads: 2,002
- Last 12 Months: 20
- Last 6 weeks: 6

Article

VSV: L2-Miss-Driven Variable Supply-Voltage Scaling for Low Power

Hai Li,
Chen-Yong Cher,
T. N. Vijaykumar,
Kaushik Roy

Page 19

Energy-efficient processor design is becoming more and more important with technology scaling and with high performance requirements. Supply-voltage scaling is an efficient way to reduce energy by lowering the operating voltage and the clock frequency of ...

Metrics
- Total Citations: 22
- Total Downloads: 262
- Last 12 Months: 1
- Last 6 weeks: 0

Article

A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor

Shubhendu S. Mukherjee,
Christopher Weaver,
Joel Emer,
Steven K. Reinhardt,
Todd Austin

Page 29

Single-event upsets from particle strikes have become a key challenge in microprocessor design. Techniques to deal with these transient faults exist, but come at a cost. Designers clearly require accurate estimates of processor error rates to make ...

Metrics
- Total Citations: 215
- Total Downloads: 1,418
- Last 12 Months: 10
- Last 6 weeks: 3

Article

TLC: Transmission Line Caches

Bradford M. Beckmann,
David A. Wood

Page 43

It is widely accepted that the disproportionate scaling of transistor and conventional on-chip interconnect performance presents a major barrier to future high performance systems. Previous research has focused on wire-centric designs that use parallelism, ...

Metrics
- Total Citations: 37
- Total Downloads: 452
- Last 12 Months: 0
- Last 6 weeks: 0

Article

Distance Associativity for High-Performance Energy-Efficient Non-Uniform Cache Architectures

Zeshan Chishti,
Michael D. Powell,
T. N. Vijaykumar

Page 55

Wire delays continue to grow as the dominant component of latency for large caches. A recent work proposed an adaptive, non-uniform cache architecture (NUCA) to manage large, on-chip caches. By exploiting the variation in access time across widely-spaced ...

Metrics
- Total Citations: 67
- Total Downloads: 613
- Last 12 Months: 0
- Last 6 weeks: 0

Article

Near-Optimal Precharging in High-Performance Nanoscale CMOS Caches

Se-Hyun Yang,
Babak Falsafi

Page 67

High-performance caches statically pull up the bit-lines in all cache subarrays to optimize cache access latency. Unfortunately, such an architecture results in a significant waste of energy in nanoscale CMOS implementations due to high leakage and bitline ...

Metrics
- Total Citations: 4
- Total Downloads: 278
- Last 12 Months: 0
- Last 6 weeks: 0

Article

Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction

Rakesh Kumar,
Keith I. Farkas,
Norman P. Jouppi,
Parthasarathy Ranganathan,
Dean M. Tullsen

Page 81

This paper proposes and evaluates single-ISA heterogeneous multi-core architectures as a mechanism to reduce processor power dissipation. Our design incorporates heterogeneous cores representing different points in the power/performance design space; during ...

Metrics
- Total Citations: 226
- Total Downloads: 1,883
- Last 12 Months: 10
- Last 6 weeks: 1

Article

Runtime Power Monitoring in High-End Processors: Methodology and Empirical Data

Canturk Isci,
Margaret Martonosi

Page 93

With power dissipation becoming an increasingly vexing problem across many classes of computer systems, measuring power dissipation of real, running systems has become crucial for hardware and software system research and design. Live power measurements are ...

Metrics
- Total Citations: 135
- Total Downloads: 1,911
- Last 12 Months: 4
- Last 6 weeks: 1

Article

Power-driven Design of Router Microarchitectures in On-chip Networks

Hangsheng Wang,
Li-Shiuan Peh,
Sharad Malik

Page 105

As demand for bandwidth increases in systems-on-a-chip and chip multiprocessors, networks are fast replacing buses and dedicated wires as the pervasive interconnect fabric for on-chip communication. The tight delay requirements faced by on-chip networks ...

Metrics
- Total Citations: 99
- Total Downloads: 976
- Last 12 Months: 0
- Last 6 weeks: 0

Article

Optimum Power/Performance Pipeline Depth

A. Hartstein,
Thomas R. Puzak

Page 117

The impact of pipeline length on both the power and performance of a microprocessor is explored both theoretically and by simulation. A theory is presented for a wide range of power/performance metrics, BIPS^m/W. The theory shows that the more important ...

Metrics
- Total Citations: 26
- Total Downloads: 905
- Last 12 Months: 3
- Last 6 weeks: 1

Article

Processor Acceleration Through Automated Instruction Set Customization

Nathan Clark,
Hongtao Zhong,
Scott Mahlke

Page 129

Application-specific extensions to the computational capabilities of a processor provide an efficient mechanism to meet the growing performance and power demands of embedded applications. Hardware, in the form of new function units (or co-processors), and ...

Metrics
- Total Citations: 79
- Total Downloads: 578
- Last 12 Months: 0
- Last 6 weeks: 0

Article

The Reconfigurable Streaming Vector Processor (RSVP™)

Silviu Ciricescu,
Ray Essick,
Brian Lucas,
Phil May,
Kent Moat,
Jim Norris,
Michael Schuette,
Ali Saidi

Page 141

The need to process multimedia data places large computational demands on portable/embedded devices. These multimedia functions share common characteristics: they are computationally intensive and data-streaming, performing the same operation(s) on many data ...

Metrics
- Total Citations: 41
- Total Downloads: 948
- Last 12 Months: 2
- Last 6 weeks: 0

Article

Scaling and Characterizing Database Workloads: Bridging the Gap between Research and Practice

Richard A. Hankins,
Trung Diep,
Murali Annavaram,
Brian Hirano,
Harald Eri,
Hubert Nueckel,
John P. Shen

Page 151

On-line Transaction Processing (OLTP) workloads are crucial benchmarks for the design and analysis of server processors. Typical cached configurations used by researchers to simulate OLTP workloads are orders of magnitude smaller than the fully scaled ...

Metrics
- Total Citations: 26
- Total Downloads: 698
- Last 12 Months: 1
- Last 6 weeks: 1

Article

In Memory of Bob Rau

Michael Schlansker

Page 165

Metrics
- Total Citations: 0
- Total Downloads: 278
- Last 12 Months: 1
- Last 6 weeks: 0

Article

Generational Cache Management of Code Traces in Dynamic Optimization Systems

Kim Hazelwood,
Michael D. Smith

Page 169

A dynamic optimizer is a runtime software system that groups a program's instruction sequences into traces, optimizes those traces, stores the optimized traces in a software-based code cache, and then executes the optimized code in the code cache. To ...

Metrics
- Total Citations: 15
- Total Downloads: 401
- Last 12 Months: 0
- Last 6 weeks: 0

Article

The Performance of Runtime Data Cache Prefetching in a Dynamic Optimization System

Jiwei Lu,
Howard Chen,
Rao Fu,
Wei-Chung Hsu,
Bobbie Othmer,
Pen-Chung Yew,
Dong-Yuan Chen

Page 180

Traditional software controlled data cache prefetching is often ineffective due to the lack of runtime cache miss and miss address information. To overcome this limitation, we implement runtime data cache prefetching in the dynamic optimization system ADORE ...

Metrics
- Total Citations: 38
- Total Downloads: 1,005
- Last 12 Months: 7
- Last 6 weeks: 1

Article

IA-32 Execution Layer: a two-phase dynamic translator designed to support IA-32 applications on Itanium®-based systems

Leonid Baraz,
Tevi Devor,
Orna Etzion,
Shalom Goldenberg,
Alex Skaletsky,
Yun Wang,
Yigel Zemach

Page 191

IA-32 Execution Layer (IA-32 EL) is a new technology that executes IA-32 applications on Intel® Itanium® processor family systems. Currently, support for IA-32 applications on Itanium-based platforms is achieved using hardware circuitry on the Itanium ...

Metrics
- Total Citations: 71
- Total Downloads: 714
- Last 12 Months: 3
- Last 6 weeks: 1

Article

LLVA: A Low-level Virtual Instruction Set Architecture

Vikram Adve,
Chris Lattner,
Michael Brukman,
Anand Shukla,
Brian Gaeke

Page 205

A virtual instruction set architecture (V-ISA) implemented via a processor-specific software translation layer can provide great flexibility to processor designers. Recent examples such as Crusoe and DAISY, however, have used existing hardware instruction ...

Metrics
- Total Citations: 30
- Total Downloads: 888
- Last 12 Months: 4
- Last 6 weeks: 0

Article

Comparing Program Phase Detection Techniques

Ashutosh S. Dhodapkar,
James E. Smith

Page 217

Detecting program phase changes accurately is an important aspect of dynamically adaptable systems. Three dynamic program phase detection techniques are compared - using instruction working sets, basic block vectors (BBV), and conditional branch counts. ...

Metrics
- Total Citations: 81
- Total Downloads: 1,121
- Last 12 Months: 0
- Last 6 weeks: 0

Article

Using Interaction Costs for Microarchitectural Bottleneck Analysis

Brian A. Fields,
Rastislav Bodík,
Mark D. Hill,
Chris J. Newburn

Page 228

Attacking bottlenecks in modern processors is difficult because many microarchitectural events overlap with each other. This parallelism makes it difficult to both (a) assign a cost to an event (e.g., to one of two overlapping cache misses) and (b) assign ...

Metrics
- Total Citations: 24
- Total Downloads: 442
- Last 12 Months: 1
- Last 6 weeks: 0

Article

Fast Path-Based Neural Branch Prediction

Daniel A. Jiménez

Page 243

Microarchitectural prediction based on neural learning has received increasing attention in recent years. However, neural prediction remains impractical because its superior accuracy over conventional predictors is not enough to offset the cost imposed by ...

Metrics
- Total Citations: 31
- Total Downloads: 920
- Last 12 Months: 3
- Last 6 weeks: 2

Article

Hardware Support for Control Transfers in Code Caches

Ho-Seop Kim,
James E. Smith

Page 253

Many dynamic optimization and/or binary translation systems hold optimized/translated superblocks in a code cache. Conventional code caching systems suffer from overheads when control is transferred from one cached superblock to another, especially via ...

Metrics
- Total Citations: 20
- Total Downloads: 424
- Last 12 Months: 2
- Last 6 weeks: 1

Article

Exploiting Value Locality in Physical Register Files

Saisanthosh Balakrishnan,
Gurindar S. Sohi

Page 265

The physical register file is an important component of a dynamically-scheduled processor. Increasing the amount of parallelism places increasing demands on the physical register file, calling for alternative file organization and management ...

Metrics
- Total Citations: 21
- Total Downloads: 382
- Last 12 Months: 0
- Last 6 weeks: 0

Article

Macro-op Scheduling: Relaxing Scheduling Loop Constraints

Ilhyun Kim,
Mikko H. Lipasti

Page 277

Ensuring back-to-back execution of dependent instructions in a conventional out-of-order processor requires scheduling logic that wakes up and selects instructions at the same rate as they are executed. To sustain high performance, integer ALU instructions ...

Metrics
- Total Citations: 29
- Total Downloads: 248
- Last 12 Months: 0
- Last 6 weeks: 0

Article

WaveScalar

Steven Swanson,
Ken Michelson,
Andrew Schwerin,
Mark Oskin

Page 291

Silicon technology will continue to provide an exponential increase in the availability of raw transistors. Effectively translating this resource into application performance, however, is an open challenge. Ever increasing wire-delay relative to switching ...

Metrics
- Total Citations: 111
- Total Downloads: 1,603
- Last 12 Months: 57
- Last 6 weeks: 7

Acceptance Rates

MICRO 36 paper acceptance rate: 35 of 134 submissions, 26%.

Overall acceptance rate: 484 of 2,242 submissions, 22%.

Year	Submitted	Accepted	Rate
MICRO-48	283	61	22%
MICRO-47	279	53	19%
MICRO-46	239	39	16%
MICRO 41	210	40	19%
MICRO 40	166	35	21%
MICRO 39	174	42	24%
MICRO 38	147	29	20%
MICRO 37	158	29	18%
MICRO 36	134	35	26%
MICRO 33	110	31	28%
MICRO 32	131	27	21%
MICRO 31	108	28	26%
MICRO 30	103	35	34%
Overall	2,242	484	22%