Top Picks Ignite Innovation
This IEEE Micro Special Issue on Top Picks includes 12 outstanding papers selected from those published in 2023 computer architecture conferences. When we were prepping and putting together all the articles in this special issue, another ...
Special Issue on Top Picks From the 2023 Computer Architecture Conferences
It is our pleasure to introduce the IEEE Micro Special Issue on Top Picks from the 2023 Computer Architecture Conferences. This special issue includes 12 articles chosen by a selection committee (SC) as being the most significant research ...
Simultaneous and Heterogenous Multithreading: Exploiting Simultaneous and Heterogeneous Parallelism in Accelerator-Rich Architectures
The addition of domain-specific hardware accelerators and general-purpose processors that support vector and scalar models makes modern computers undoubtedly heterogeneous. However, existing programming models and runtime systems target using the most ...
Decoupled Vector Runahead for Prefetching Nested Memory-Access Chains
Decoupled vector runahead (DVR) exploits massive amounts of memory-level parallelism to improve the performance of applications that feature indirect memory accesses by dynamically inferring loop bounds at runtime, recognizing striding loads, and ...
Per-Instruction Cycle Stacks Through Time-Proportional Event Analysis
Understanding what applications spend time on and why is critical for effective performance optimization. Unfortunately, current state-of-the-art performance analysis tools are generally unable to provide this information. The fundamental reason is that ...
End-to-End Cloud Application Cloning With Ditto
The lack of publicly available cloud services has been a recurring problem in architecture and systems. Although open source benchmarks exist, they do not capture the complexity of cloud services. Application cloning is a promising approach; however, ...
Contiguitas: The Pursuit of Physical Memory Contiguity in Data Centers
- Kaiyang Zhao,
- Kaiwen Xue,
- Ziqi Wang,
- Dan Schatzberg,
- Leon Yang,
- Antonis Manousis,
- Johannes Weiner,
- Rik Van Riel,
- Bikash Sharma,
- Chunqiang Tang,
- Dimitrios Skarlatos
The unabating growth of the memory needs of emerging data center applications has exacerbated the scalability bottleneck of virtual memory. However, reducing the overhead of address translation will remain onerous until the physical memory contiguity ...
Mosaic Pages: Big TLB Reach With Small Pages
- Jaehyun Han,
- Krishnan Gosakan,
- William Kuszmaul,
- Ibrahim N. Mubarek,
- Nirjhar Mukherjee,
- Karthik Sriram,
- Guido Tagliavini,
- Evan West,
- Michael A. Bender,
- Abhishek Bhattacharjee,
- Alex Conway,
- Martín Farach-Colton,
- Jayneel Gandhi,
- Rob Johnson,
- Sudarsun Kannan,
- Donald E. Porter
This article introduces mosaic pages, which increase translation lookaside buffer (TLB) reach by compressing multiple, discrete translations into one TLB entry. Mosaic leverages virtual contiguity for locality, but does not use physical contiguity. Mosaic ...
RowPress Vulnerability in Modern DRAM Chips
- Haocong Luo,
- Ataberk Olgun,
- Abdullah Giray Yağlikçi,
- Yahya Can Tuğrul,
- Steve Rhyner,
- Meryem Banu Cavlak,
- Joël Lindegger,
- Mohammad Sadrosadati,
- Onur Mutlu
Memory isolation is a critical property for system reliability, security, and safety. We demonstrate RowPress, a dynamic random-access memory (DRAM) read disturbance phenomenon different from the well-known RowHammer. RowPress induces bitflips by keeping ...
Hardware-Assisted Fault Isolation: Going Beyond the Limits of Software-Based Sandboxing
- Shravan Narayan,
- Tal Garfinkel,
- Mohammadkazem Taram,
- Joey Rudek,
- Daniel Moghimi,
- Evan Johnson,
- Chris Fallin,
- Anjo Vahldiek-Oberwagner,
- Michael LeMay,
- Ravi Sahita,
- Dean Tullsen,
- Deian Stefan
Hardware-assisted fault isolation (HFI) is a minimal extension to current processors that supports secure, flexible, and efficient in-process isolation. HFI addresses the limitations of existing software-based fault isolation (SFI) systems, including ...
Practical Online Reinforcement Learning for Microprocessors With Micro-Armed Bandit
Although online reinforcement learning (RL) has shown promise for microarchitecture decision making, processor vendors are still reluctant to adopt it. There are two main reasons that make RL-based solutions unattractive. First, they have high complexity ...
Programmable Olfactory Computing
Although smell is arguably the most visceral of senses, olfactory computing has been barely explored in the mainstream. We argue that this is a good time to explore olfactory computing as driver applications are emerging, sensors are dramatically better, ...
AuRORA: A Full-Stack Solution for Scalable and Virtualized Accelerator Integration
To meet the increasingly demanding compute requirements of modern workloads, systems on chip (SoCs) must provide an accelerator-rich hardware architecture and software programming interface. However, scalability remains a first-order concern, as ...
Distributed Brain–Computer Interfacing With a Networked Multiaccelerator Architecture
- Raghavendra Pradyumna Pothukuchi,
- Karthik Sriram,
- Michał Gerasimiuk,
- Muhammed Ugur,
- Rajit Manohar,
- Anurag Khandelwal,
- Abhishek Bhattacharjee
SCALO is the first distributed brain–computer interface (BCI) consisting of multiple wireless-networked implants placed on different brain regions. SCALO unlocks new treatment options for debilitating neurological disorders and new research into brainwide ...
Analysis of Historical Patenting Behavior and Patent Characteristics of Computer Architecture Companies—Part XI: Patent Families
In previous parts of this series, I analyzed.
Navigating Applications Development in Generative AI
In its earliest prototypes, GitHub Copilot stood apart from other tools. It demonstrated a remarkable ability to accelerate the work of intermediate coders by 20% to 40%, mainly with standardized languages like Python. This is a significant productivity ...