DOI: 10.1145/379240
ISCA '01: Proceedings of the 28th annual international symposium on Computer architecture
ACM 2001 Proceeding
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
Conference:
ISCA01: 28th International Symposium on Computer Architecture, Göteborg, Sweden, 30 June 2001 - 4 July 2001
ISBN:
978-0-7695-1162-7
Published:
01 June 2001
Sponsors:
SIGARCH, IEEE-CS\TCCA
Abstract

No abstract available.

Article
Reviewers
Article
Execution-based prediction using speculative slices

A relatively small set of static instructions has significant leverage on program execution performance. These problem instructions contribute a disproportionate number of cache misses and branch mispredictions because their behavior cannot be ...

Article
Speculative precomputation: long-range prefetching of delinquent loads

This paper explores Speculative Precomputation, a technique that uses idle thread context in a multithreaded architecture to improve performance of single-threaded applications. It attacks program stalls from data cache misses by pre-computing future ...

Article
Dynamically allocating processor resources between nearby and distant ILP

Modern superscalar processors use wide instruction issue widths and out-of-order execution in order to increase instruction-level parallelism (ILP). Because instructions must be committed in order so as to guarantee precise exceptions, increasing ILP ...

Article
Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors

Hardly predictable data addresses in many irregular applications have rendered prefetching ineffective. In many cases, the only accurate way to predict these addresses is to directly execute the code that generates them. As multithreaded architectures ...

Article
Data prefetching by dependence graph precomputation

Data cache misses reduce the performance of wide-issue processors by stalling the data supply to the processor. Prefetching data by predicting the miss address is one way to tolerate the cache miss latencies. But current applications with irregular ...

Article
Concurrency, latency, or system overhead: which has the largest impact on uniprocessor DRAM-system performance?

Given a fixed CPU architecture and a fixed DRAM timing specification, there is still a large design space for a DRAM system organization. Parameters include the number of memory channels, the bandwidth of each channel, burst sizes, queue sizes and ...

Article
Focusing processor policies via critical-path prediction

Although some instructions hurt performance more than others, current processors typically apply scheduling and speculation as if each instruction was equally costly. Instruction cost can be naturally expressed through the critical path: if we could ...

Article
Automated design of finite state machine predictors for customized processors

Customized processors use compiler analysis and design automation techniques to take a generalized architectural model and create a specific instance of it which is optimized to a given application or set of applications. These processors offer the ...

Article
Better exploration of region-level value locality with integrated computation reuse and value prediction

Computation-reuse and value-prediction are two recent techniques for improving microprocessor performance by exploiting value localities. They both aim at breaking the data dependence limit in traditional processors. In this paper, we propose a ...

Article
CryptoManiac: a fast flexible architecture for secure communication

The growth of the Internet as a vehicle for secure communication and electronic commerce has brought cryptographic processing performance to the forefront of high throughput system design. This trend will be further underscored with the widespread ...

Article
QoS provisioning in clusters: an investigation of Router and NIC design

Design of high performance cluster networks (routers) with Quality-of-Service (QoS) guarantees is becoming increasingly important to support a variety of multimedia applications, many of which have real-time constraints. Most commercial routers, which ...

Article
Locality vs. criticality

Current memory hierarchies exploit locality of references to reduce load latency and thereby improve processor performance. Locality based schemes aim at reducing the number of cache misses and tend to ignore the nature of misses. This leads to a ...

Article
Dead-block prediction & dead-block correlating prefetchers

Effective data prefetching requires accurate mechanisms to predict both “which” cache blocks to prefetch and “when” to prefetch them. This paper proposes the Dead-Block Predictors (DBPs), trace-based predictors that accurately identify “when” an L1 data ...

Article
Code layout optimizations for transaction processing workloads

Commercial applications such as databases and Web servers constitute the most important market segment for high-performance servers. Among these applications, on-line transaction processing (OLTP) workloads provide a challenging set of requirements for ...

Article
Exploring and exploiting wire-level pipelining in emerging technologies

Pipelining is a technique that has long since been considered fundamental by computer architects. However, the world of nanoelectronics is pushing the idea of pipelining to new and lower levels — particularly the device level. How this affects circuits ...

Article
NanoFabrics: spatial computing using molecular electronics

The continuation of the remarkable exponential increases in processing power over the recent past faces imminent challenges due in part to the physics of deep-submicron CMOS devices and the costs of both chip masks and future fabrication plants. A ...

Article
A simple method for extracting models for protocol code

The use of model checking for validation requires that models of the underlying system be created. Creating such models is both difficult and error prone and as a result, verification is rarely used despite its advantages. In this paper, we present a ...

Article
Removing architectural bottlenecks to the scalability of speculative parallelization

Speculative thread-level parallelization is a promising way to speed up codes that compilers fail to parallelize. While several speculative parallelization schemes have been proposed for different machine sizes and types of codes, the results so far ...

Article
Power and energy reduction via pipeline balancing

Minimizing power dissipation is an important design requirement for both portable and non-portable systems. In this work, we propose an architectural solution to the power problem that retains performance while reducing power. The technique, known as ...

Article
Energy-effective issue logic

The issue logic of a dynamically-scheduled superscalar processor is a complex mechanism devoted to start the execution of multiple instructions every cycle. Due to its complexity, it is responsible for a significant percentage of the energy consumed by ...

Article
Cache decay: exploiting generational behavior to reduce cache leakage power

Power dissipation is increasingly important in CPUs ranging from those intended for mobile use, all the way up to high-performance processors for high-end servers. While the bulk of the power dissipated is dynamic switching power, leakage power is also ...

Article
Variability in the execution of multimedia applications and implications for architecture

Multimedia applications are an increasingly important workload for general-purpose processors. This paper analyzes frame-level execution time variability for several multimedia applications on general-purpose architectures. There are two reasons for ...

Article
Measuring Experimental Error in Microprocessor Simulation

We measure the experimental error that arises from the use of non-validated simulators in computer architecture research, with the goal of increasing the rigor of simulation-based studies. We describe the methodology that we used to validate ...

Article
Rapid profiling via stratified sampling

Sophisticated binary translators and dynamic optimizers demand a program profiler with low overhead, high accuracy, and the ability to collect a variety of profile types. A profiling scheme that achieves these goals is proposed. Conceptually, the ...

Article
Contributors
  • Chalmers University of Technology


Acceptance Rates

ISCA '01 Paper Acceptance Rate: 24 of 163 submissions, 15%
Overall Acceptance Rate: 543 of 3,203 submissions, 17%

Year        Submitted   Accepted   Rate
ISCA '22    400         67         17%
ISCA '19    365         62         17%
ISCA '17    322         54         17%
ISCA '13    288         56         19%
ISCA '12    262         47         18%
ISCA '08    259         37         14%
ISCA '06    234         31         13%
ISCA '05    194         45         23%
ISCA '04    217         31         14%
ISCA '03    184         36         20%
ISCA '02    180         27         15%
ISCA '01    163         24         15%
ISCA '99    135         26         19%
Overall     3,203       543        17%