SIGPLAN: Vol 39, No 11

article

Programming with transactional coherence and consistency (TCC)

Pages 1–13 https://doi.org/10.1145/1037187.1024395

Transactional Coherence and Consistency (TCC) offers a way to simplify parallel programming by executing all code within transactions. In TCC systems, transactions serve as the fundamental unit of parallel work, communication and coherence. As each ...

article

Spatial computation

Pages 14–26 https://doi.org/10.1145/1037187.1024396

This paper describes a computer architecture, Spatial Computation (SC), which is based on the translation of high-level language programs directly into hardware structures. SC program implementations are completely distributed, with no centralized ...

article

An ultra low-power processor for sensor networks

Pages 27–36 https://doi.org/10.1145/1037187.1024397

We present a novel processor architecture designed specifically for use in low-power wireless sensor-network nodes. Our sensor network asynchronous processor (SNAP/LE) is based on an asynchronous data-driven 16-bit RISC core with an extremely low-power ...

SESSION: Storage

article

D-SPTF: decentralized request distribution in brick-based storage systems

Pages 37–47 https://doi.org/10.1145/1037187.1024399

Distributed Shortest-Positioning Time First (D-SPTF) is a request distribution protocol for decentralized systems of storage servers. D-SPTF exploits high-speed interconnects to dynamically select which server, among those with a replica, should service ...

article

FAB: building distributed enterprise disk arrays from commodity components

Pages 48–58 https://doi.org/10.1145/1037187.1024400

This paper describes the design, implementation, and evaluation of a Federated Array of Bricks (FAB), a distributed disk array that provides the reliability of traditional enterprise arrays with lower cost and better scalability. FAB is built from a ...

article

Deconstructing storage arrays

Pages 59–71 https://doi.org/10.1145/1037187.1024401

We introduce Shear, a user-level software tool that characterizes RAID storage arrays. Shear employs a set of controlled algorithms combined with statistical techniques to automatically determine the important properties of a RAID system, including the ...

SESSION: Security

article

HIDE: an infrastructure for efficiently protecting information leakage on the address bus

Pages 72–84 https://doi.org/10.1145/1037187.1024403

The XOM-based secure processor has recently been introduced as a mechanism to provide copy- and tamper-resistant execution. XOM provides support for encryption/decryption and integrity checking. However, neither XOM nor any other current approach adequately ...

article

Secure program execution via dynamic information flow tracking

Pages 85–96 https://doi.org/10.1145/1037187.1024404

We present a simple architectural mechanism called dynamic information flow tracking that can significantly improve the security of computing systems with negligible performance overhead. Dynamic information flow tracking protects programs against ...

SESSION: Architecture

article

Coherence decoupling: making use of incoherence

Pages 97–106 https://doi.org/10.1145/1037187.1024406

This paper explores a new technique called coherence decoupling, which breaks a traditional cache coherence protocol into two protocols: a Speculative Cache Lookup (SCL) protocol and a safe, backing coherence protocol. The SCL protocol produces a ...

article

Continual flow pipelines

Pages 107–119 https://doi.org/10.1145/1037187.1024407

Increased integration in the form of multiple processor cores on a single die, relatively constant die sizes, shrinking power envelopes, and emerging applications create a new challenge for processor architects. How to build a processor that provides ...

article

Scalable selective re-execution for EDGE architectures

Pages 120–132 https://doi.org/10.1145/1037187.1024408

Pipeline flushes are becoming increasingly expensive in modern microprocessors with large instruction windows and deep pipelines. Selective re-execution is a technique that can reduce the penalty of mis-speculations by re-executing only instructions ...

SESSION: Potpourri

article

HOIST: a system for automatically deriving static analyzers for embedded systems

Pages 133–143 https://doi.org/10.1145/1037187.1024410

Embedded software must meet conflicting requirements such as being highly reliable, running on resource-constrained platforms, and being developed rapidly. Static program analysis can help meet all of these goals. People developing analyzers for ...

article

Helper threads via virtual multithreading on an experimental Itanium® 2 processor-based platform

Pages 144–155 https://doi.org/10.1145/1037187.1024411

Helper threading is a technology to accelerate a program by exploiting a processor's multithreading capability to run "assist" threads. Previous experiments on hyper-threaded processors have demonstrated significant speedups by using helper threads to ...

article

Low-overhead memory leak detection using adaptive statistical profiling

Pages 156–164 https://doi.org/10.1145/1037187.1024412

Sampling has been successfully used to identify performance optimization opportunities. We would like to apply similar techniques to check program correctness. Unfortunately, sampling provides poor coverage of infrequently executed code, where bugs ...

SESSION: Memory system analysis and optimization

article

Locality phase prediction

Pages 165–176 https://doi.org/10.1145/1037187.1024414

As computer memory hierarchy becomes adaptive, its performance increasingly depends on forecasting the dynamic program locality. This paper presents a method that predicts the locality phases of a program by a combination of locality profiling and run-...

article

Dynamic tracking of page miss ratio curve for memory management

Pages 177–188 https://doi.org/10.1145/1037187.1024415

Memory can be efficiently utilized if the dynamic memory demands of applications can be determined and analyzed at run-time. The page miss ratio curve (MRC), i.e., the page miss rate vs. memory size curve, is a good performance-directed metric to serve this ...

article

Compiler orchestrated prefetching via speculation and predication

Pages 189–198 https://doi.org/10.1145/1037187.1024416

This paper introduces a compiler orchestrated prefetching system as a unified framework geared toward ameliorating the gap between processing speeds and memory access latencies. We focus the scope of the optimization on specific subsets of the program ...

article

Software prefetching for mark-sweep garbage collection: hardware analysis and software redesign

Pages 199–210 https://doi.org/10.1145/1037187.1024417

Tracing garbage collectors traverse references from live program variables, transitively tracing out the closure of live objects. Memory accesses incurred during tracing are essentially random: a given object may contain references to any other object. ...

SESSION: Reliability

article

Devirtualizable virtual machines enabling general, single-node, online maintenance

Pages 211–223 https://doi.org/10.1145/1037187.1024419

Maintenance is the dominant source of downtime at high availability sites. Unfortunately, the dominant mechanism for reducing this downtime, cluster rolling upgrade, has two shortcomings that have prevented its broad acceptance. First, cluster-style ...

article

Fingerprinting: bounding soft-error detection latency and bandwidth

Pages 224–234 https://doi.org/10.1145/1037187.1024420

Recent studies have suggested that the soft-error rate in microprocessor logic will become a reliability concern by 2010. This paper proposes an efficient error detection technique, called fingerprinting, that detects differences in execution across a ...

article

Application-level checkpointing for shared memory programs

Pages 235–247 https://doi.org/10.1145/1037187.1024421

Trends in high-performance computing are making it necessary for long-running applications to tolerate hardware faults. The most commonly used approach is checkpoint and restart (CPR) - the state of the computation is saved periodically on disk, and ...

SESSION: Power

article

Formal online methods for voltage/frequency control in multiple clock domain microprocessors

Pages 248–259 https://doi.org/10.1145/1037187.1024423

Multiple Clock Domain (MCD) processors are a promising future alternative to today's fully synchronous designs. Dynamic Voltage and Frequency Scaling (DVFS) in an MCD processor has the extra flexibility to adjust the voltage and frequency in each domain ...

article

Heat-and-run: leveraging SMT and CMP to manage power density through the operating system

Pages 260–270 https://doi.org/10.1145/1037187.1024424

Power density in high-performance processors continues to increase with technology generations as scaling of current, clock speed, and device density outpaces the downscaling of supply voltage and thermal ability of packages to dissipate heat. Power ...

article

Performance directed energy management for main memory and disks

Pages 271–283 https://doi.org/10.1145/1037187.1024425

Much research has been conducted on energy management for memory and disks. Most studies use control algorithms that dynamically transition devices to low power modes after they are idle for a certain threshold period of time. The control algorithms ...
