- research-article, June 2024
HYDRA: A Hybrid Resistance Drift Resilient Architecture for Phase Change Memory-Based Neural Network Accelerators
IEEE Transactions on Computers (ITCO), Volume 73, Issue 9, Pages 2123–2135, https://doi.org/10.1109/TC.2024.3404096
In-memory Computing (IMC) using Phase Change Memory (PCM) has proven to be effective for efficient processing of Deep Neural Networks (DNNs). However, with the use of multi-level cell PCM (MLC-PCM) in NVMs-based accelerators, errors due to resistance ...
- research-article, June 2024
Enabling Reliable Memory-Mapped I/O With Auto-Snapshot for Persistent Memory Systems
IEEE Transactions on Computers (ITCO), Volume 73, Issue 9, Pages 2290–2304, https://doi.org/10.1109/TC.2024.3416683
Persistent memory (PM) is promising to be the next-generation storage device with better I/O performance. Since the traditional I/O path is too lengthy to drive PM featuring low latency and high bandwidth, prior works proposed memory-mapped I/O (MMIO) to ...
- research-article, June 2024
ISSA: Architecting CNN Accelerators Using Input-Skippable, Set-Associative Computing-in-Memory
IEEE Transactions on Computers (ITCO), Volume 73, Issue 9, Pages 2136–2149, https://doi.org/10.1109/TC.2024.3404060
Among several emerging architectures, computing in memory (CIM), which features in-situ analog computation, is a potential solution to the data movement bottleneck of the Von Neumann architecture for artificial intelligence (AI). Interestingly, more ...
- research-article, May 2024
SimBU: Self-Similarity-Based Hybrid Binary-Unary Computing for Nonlinear Functions
IEEE Transactions on Computers (ITCO), Volume 73, Issue 9, Pages 2192–2205, https://doi.org/10.1109/TC.2024.3398512
Unary computing is a relatively new method for implementing arbitrary nonlinear functions that uses unpacked thermometer number encoding, enabling much lower hardware costs. In its original form, unary computing provides no trade-off between accuracy and ...
- research-article, January 2024
Achieving DRAM-Like PCM by Trading Off Capacity for Latency
IEEE Transactions on Computers (ITCO), Volume 73, Issue 4, Pages 1180–1189, https://doi.org/10.1109/TC.2024.3355779
Phase Change Memory (PCM) is considered one of the most promising scalable non-volatile main memory alternatives to DRAM. It provides ∼ ...
- research-article, January 2024
Enhancing Graph Random Walk Acceleration via Efficient Dataflow and Hybrid Memory Architecture
IEEE Transactions on Computers (ITCO), Volume 73, Issue 3, Pages 887–901, https://doi.org/10.1109/TC.2023.3347674
Graph random walk sampling is becoming increasingly important with the widespread popularity of graph applications. It aims to capture the desirable graph properties by launching multiple walkers to collect feature paths. However, previous research ...
- research-article, December 2023
Learning the Error Features of Approximate Multipliers for Neural Network Applications
IEEE Transactions on Computers (ITCO), Volume 73, Issue 3, Pages 842–856, https://doi.org/10.1109/TC.2023.3345163
Approximate multipliers (AMs) have widely been investigated to pursue high-performance and energy-efficient hardware designs for error-tolerant applications, such as neural networks (NNs). The computing accuracy of an AM has been evaluated by using ...
- research-article, December 2023
Honeycomb: Ordered Key-Value Store Acceleration on an FPGA-Based SmartNIC
- Junyi Liu,
- Aleksandar Dragojević,
- Shane Fleming,
- Antonios Katsarakis,
- Dario Korolija,
- Igor Zablotchi,
- Ho-Cheung Ng,
- Anuj Kalia,
- Miguel Castro
IEEE Transactions on Computers (ITCO), Volume 73, Issue 3, Pages 857–871, https://doi.org/10.1109/TC.2023.3345173
In-memory ordered key-value stores are an important building block in modern distributed applications. We present Honeycomb, a hybrid software-hardware system for accelerating read-dominated workloads on ordered key-value stores that provides ...
- research-article, December 2023
CDS: Coupled Data Storage to Enhance Read Performance of 3D TLC NAND Flash Memory
IEEE Transactions on Computers (ITCO), Volume 73, Issue 3, Pages 694–707, https://doi.org/10.1109/TC.2023.3338474
Due to the strong demand for massive storage capacity, the density of flash memory has been improved in terms of technology node scaling, multi-bit per cell technique, and 3D stacking. However, these techniques also degrade read performance and ...
- research-article, December 2023
Wrong-Path-Aware Entangling Instruction Prefetcher
IEEE Transactions on Computers (ITCO), Volume 73, Issue 2, Pages 548–559, https://doi.org/10.1109/TC.2023.3337308
Instruction prefetching is instrumental for guaranteeing a high flow of instructions through the processor front end for applications whose working set does not fit in the lower-level caches. Examples of such applications are server workloads, whose ...
- research-article, November 2023
Stochastic Circuits for Computing Weighted Ratio With Applications to Multiclass Bayesian Inference Machine
IEEE Transactions on Computers (ITCO), Volume 73, Issue 2, Pages 621–630, https://doi.org/10.1109/TC.2023.3329998
Bayesian inference is one method of statistical inference in machine learning. It predicts the probability that a given test belongs to a certain class and is widely used in various applications such as medical diagnosis, spam classification and fraud ...
- research-article, November 2023
A High-Performance, Energy-Efficient Modular DMA Engine Architecture
- Thomas Benz,
- Michael Rogenmoser,
- Paul Scheffler,
- Samuel Riedel,
- Alessandro Ottaviano,
- Andreas Kurth,
- Torsten Hoefler,
- Luca Benini
IEEE Transactions on Computers (ITCO), Volume 73, Issue 1, Pages 263–277, https://doi.org/10.1109/TC.2023.3329930
Data transfers are essential in today's computing systems as latency and complex memory access patterns are increasingly challenging to manage. Direct memory access engines (DMAEs) are critically needed to transfer data independently of the ...
- research-article, October 2023
An Efficient Deep Reinforcement Learning-Based Automatic Cache Replacement Policy in Cloud Block Storage Systems
IEEE Transactions on Computers (ITCO), Volume 73, Issue 1, Pages 164–177, https://doi.org/10.1109/TC.2023.3325625
With the popularity of cloud services, cloud block storage (CBS) systems have been widely deployed by cloud providers. Cloud cache plays a vital role in maintaining high and stable performance in cloud block storage systems. In the past few decades, much ...
- research-article, September 2023
Split-Radix Based Compact Hardware Architecture for CRYSTALS-Kyber
IEEE Transactions on Computers (ITCO), Volume 73, Issue 1, Pages 97–108, https://doi.org/10.1109/TC.2023.3320040
Facing the threat of large-scale quantum computers to traditional public-key cryptography, the National Institute of Standards and Technology has conducted Post-Quantum Cryptography algorithms evaluation for a long time, and CRYSTALS-Kyber has been ...
- research-article, August 2023
MemPool: A Scalable Manycore Architecture With a Low-Latency Shared L1 Memory
IEEE Transactions on Computers (ITCO), Volume 72, Issue 12, Pages 3561–3575, https://doi.org/10.1109/TC.2023.3307796
Shared L1 memory clusters are a common architectural pattern (e.g., in GPGPUs) for building efficient and flexible multi-processing-element (PE) engines. However, it is a common belief that these tightly-coupled clusters would not scale beyond a few tens ...
- research-article, August 2023
Unified Digit Selection for Radix-4 Recurrence Division and Square Root
IEEE Transactions on Computers (ITCO), Volume 73, Issue 1, Pages 292–300, https://doi.org/10.1109/TC.2023.3305760
Division and square root are fundamental operations required by most computer systems. They are commonly implemented in hardware using radix-4 recurrence, which produces a 2-bit result digit on each step. Unified digit selection logic chooses the next ...
- research-article, August 2023
An Area-Efficient In-Memory Implementation Method of Arbitrary Boolean Function Based on SRAM Array
IEEE Transactions on Computers (ITCO), Volume 72, Issue 12, Pages 3416–3430, https://doi.org/10.1109/TC.2023.3301156
In-memory computing is an emerging computing paradigm to break through the von Neumann bottleneck. SRAM-based in-memory computing (SRAM-IMC) attracts great interest from industry and academia, because SRAM is technology compatible with the widely-...
- research-article, August 2023
An Edge-Side Real-Time Video Analytics System With Dual Computing Resource Control
IEEE Transactions on Computers (ITCO), Volume 72, Issue 12, Pages 3399–3415, https://doi.org/10.1109/TC.2023.3301136
Video analytics systems conduct video preprocessing to filter out unnecessary frames and model inference using appropriately selected neural networks for high analytics speed. Video preprocessing is instruction-intensive computing (IIC) executed by CPU, ...
- research-article, July 2023
HPKA: A High-Performance CRYSTALS-Kyber Accelerator Exploring Efficient Pipelining
IEEE Transactions on Computers (ITCO), Volume 72, Issue 12, Pages 3340–3353, https://doi.org/10.1109/TC.2023.3296899
CRYSTALS-Kyber (Kyber) was recently chosen as the first quantum-resistant Key Encapsulation Mechanism (KEM) scheme for standardisation, after three rounds of the PQC competition initiated by the National Institute of Standards and Technology (NIST), which began ...
- research-article, July 2023
ERA-BS: Boosting the Efficiency of ReRAM-Based PIM Accelerator With Fine-Grained Bit-Level Sparsity
IEEE Transactions on Computers (ITCO), Volume 73, Issue 9, Pages 2320–2334, https://doi.org/10.1109/TC.2023.3290869
Resistive Random-Access-Memory (ReRAM) crossbar is one of the most promising neural network accelerators, thanks to its in-memory and in-situ analog computing abilities for Matrix Multiplication-and-Accumulations (MACs). The key limitations are: 1) the ...