Volume 34, Issue 2, May 2006
article
Message from the General Chair
article
Message from the Program Chair
article
Reviewers
article
SIGARCH Guidelines
article
A Gracefully Degrading and Energy-Efficient Modular Router Architecture for On-Chip Networks

Packet-based on-chip networks are increasingly being adopted in complex System-on-Chip (SoC) designs supporting numerous homogeneous and heterogeneous functional blocks. These Network-on-Chip (NoC) architectures are required to not only provide ultra-...

article
The BlackWidow High-Radix Clos Network

This paper describes the radix-64 folded-Clos network of the Cray BlackWidow scalable vector multiprocessor. We describe the BlackWidow network which scales to 32K processors with a worst-case diameter of seven hops, and the underlying high-radix router ...

article
Memory Model = Instruction Reordering + Store Atomicity

We present a novel framework for defining memory models in terms of two properties: thread-local Instruction Reordering axioms and Store Atomicity, which describes inter-thread communication via memory. Most memory models have the store atomicity ...

article
Conditional Memory Ordering

Conventional relaxed memory ordering techniques follow a proactive model: at a synchronization point, a processor makes its own updates to memory available to other processors by executing a memory barrier instruction, ensuring that recent writes have ...

article
Architectural Semantics for Practical Transactional Memory

Transactional Memory (TM) simplifies parallel programming by allowing for parallel execution of atomic tasks. Thus far, TM systems have focused on implementing transactional state buffering and conflict resolution. Missing is a robust hardware/software ...

article
Ensemble-level Power Management for Dense Blade Servers

One of the key challenges for high-density servers (e.g., blades) is the increased costs in addressing the power and heat density associated with compaction. Prior approaches have mainly focused on reducing the heat generated at the level of an ...

article
Techniques for Multicore Thermal Management: Classification and New Exploration

Power density continues to increase exponentially with each new technology generation, posing a major challenge for thermal management in modern processors. Much past work has examined microarchitectural policies for reducing total chip power, but these ...

article
SODA: A Low-power Architecture For Software Radio

The physical layer of most wireless protocols is traditionally implemented in custom hardware to satisfy the heavy computational requirements while keeping power consumption to a minimum. These implementations are time consuming to design and difficult ...

article
An Integrated Framework for Dependable and Revivable Architectures Using Multicore Processors

This paper presents a high-availability system architecture called INDRA (an INtegrated framework for Dependable and Revivable Architecture) that enhances a multicore processor (or CMP) with novel security and fault recovery mechanisms. INDRA represents ...

article
Multiple Instruction Stream Processor

Microprocessor design is undergoing a major paradigm shift towards multi-core designs, in anticipation that future performance gains will come from exploiting thread-level parallelism in the software. To support this trend, we present a novel processor ...

article
Design and Management of 3D Chip Multiprocessors Using Network-in-Memory

Long interconnects are becoming an increasingly important problem from both power and performance perspectives. This motivates designers to adopt on-chip network-based communication infrastructures and three-dimensional (3D) designs where multiple ...

article
Slackened Memory Dependence Enforcement: Combining Opportunistic Forwarding with Decoupled Verification

An efficient mechanism to track and enforce memory dependences is crucial to an out-of-order microprocessor. The conventional approach of using cross-checked load queue and store queue, while very effective in earlier processor incarnations, suffers ...

article
Balanced Cache: Reducing Conflict Misses of Direct-Mapped Caches

Level one cache normally resides on a processor's critical path, which determines the clock frequency. Direct-mapped caches exhibit fast access time but poor hit rates compared with same-sized set-associative caches due to nonuniform accesses to the ...

article
A Case for MLP-Aware Cache Replacement

Performance loss due to long-latency memory accesses can be reduced by servicing multiple memory accesses concurrently. The notion of generating and servicing long-latency cache misses in parallel is called Memory Level Parallelism (MLP). MLP is not ...

article
Improving Cost, Performance, and Security of Memory Encryption and Authentication

Protection from hardware attacks such as snoopers and mod chips has been receiving increasing attention in computer architecture. This paper presents a new combined memory encryption/authentication scheme. Our new split counters for counter-mode ...

article
A Scalable Architecture For High-Throughput Regular-Expression Pattern Matching

We present and evaluate an architecture for high-throughput pattern matching of regular expressions. Our approach matches multiple patterns concurrently, responds rapidly to changes in the pattern set, and is well suited for synthesis in an ASIC or FPGA. ...

article
Chisel: A Storage-efficient, Collision-free Hash-based Network Processing Architecture

Longest Prefix Matching (LPM) is a fundamental part of various network processing tasks. Previously proposed approaches for LPM result in prohibitive cost and power dissipation (TCAMs) or in large memory requirements and long lookup latencies (tries), ...

article
Tolerating Dependences Between Large Speculative Threads Via Sub-Threads

Thread-level speculation (TLS) has proven to be a promising method of extracting parallelism from both integer and scientific workloads, targeting speculative threads that range in size from hundreds to several thousand dynamic instructions and have ...

article
Bulk Disambiguation of Speculative Threads in Multiprocessors

Transactional Memory (TM), Thread-Level Speculation (TLS), and Checkpointed multiprocessors are three popular architectural techniques based on the execution of multiple, cooperating speculative threads. In these environments, correctly maintaining data ...

article
Learning-Based SMT Processor Resource Distribution via Hill-Climbing

The key to high performance in Simultaneous Multithreaded (SMT) processors lies in optimizing the distribution of shared resources to active threads. Existing resource distribution techniques optimize performance only indirectly. They infer potential ...

article
Spatial Memory Streaming

Prior research indicates that there is much spatial variation in applications' memory access patterns. Modern memory systems, however, use small fixed-size cache blocks and as such cannot exploit the variation. Increasing the block size would not only ...

article
Cooperative Caching for Chip Multiprocessors

This paper presents CMP Cooperative Caching, a unified framework to manage a CMP's aggregate on-chip cache resources. Cooperative caching combines the strengths of private and shared cache organizations by forming an aggregate "shared" cache through ...

article
Reducing Startup Time in Co-Designed Virtual Machines

A Co-Designed Virtual Machine allows designers to implement a processor via a combination of hardware and software. Dynamic binary translation converts code written for a conventional (legacy) ISA into optimized code for an underlying implementation-...

article
TRAP-Array: A Disk Array Architecture Providing Timely Recovery to Any Point-in-time

RAID architectures have been used for more than two decades to recover data upon disk failures. Disk failure is just one of the many causes of damaged data. Data can be damaged by virus attacks, user errors, defective software/firmware, hardware faults, ...
