Volume 22, Issue 6, November 2023
SECTION: Special Issue on AI Acceleration on FPGAs
introduction
Free
Special Issue: “AI Acceleration on FPGAs”
Article No.: 89, Pages 1–3, https://doi.org/10.1145/3626323
research-article
High-performance Reconfigurable DNN Accelerator on a Bandwidth-limited Embedded System
Article No.: 90, Pages 1–20, https://doi.org/10.1145/3530818

Deep convolutional neural networks (DNNs) have been widely used in many applications, particularly in machine vision. It is challenging to accelerate DNNs on embedded systems because real-world machine vision applications should reserve a lot of external ...

research-article
FD-CNN: A Frequency-Domain FPGA Acceleration Scheme for CNN-Based Image-Processing Applications
Article No.: 91, Pages 1–30, https://doi.org/10.1145/3559105

In emerging edge-computing scenarios, FPGAs have been widely adopted to accelerate convolutional neural network (CNN)–based image-processing applications, such as image classification, object detection, and image segmentation. A standard ...

research-article
Open Access
An Intermediate-Centric Dataflow for Transposed Convolution Acceleration on FPGA
Article No.: 92, Pages 1–22, https://doi.org/10.1145/3561053

Transposed convolution has been prevailing in convolutional neural networks (CNNs), playing an important role in multiple scenarios such as image segmentation and back-propagation process of training CNNs. This mainly benefits from the ability to up-...

research-article
Accelerating Attention Mechanism on FPGAs based on Efficient Reconfigurable Systolic Array
Article No.: 93, Pages 1–22, https://doi.org/10.1145/3549937

Transformer model architectures have recently received great interest in natural language, machine translation, and computer vision, where attention mechanisms are their building blocks. However, the attention mechanism is expensive because of its ...

research-article
On the RTL Implementation of FINN Matrix Vector Unit
Article No.: 94, Pages 1–27, https://doi.org/10.1145/3547141

Field-programmable gate array (FPGA)–based accelerators are becoming increasingly popular for deep neural network (DNN) inference due to their ability to scale performance with increasing degrees of specialization with dataflow architectures or custom ...

research-article
ACDSE: A Design Space Exploration Method for CNN Accelerator based on Adaptive Compression Mechanism
Article No.: 95, Pages 1–26, https://doi.org/10.1145/3545177

Customized accelerators for Convolutional Neural Network (CNN) can achieve better energy efficiency than general computing platforms. However, the design of a high-performance accelerator should take into account a variety of parameters and physical ...

research-article
Open Access
TH-iSSD: Design and Implementation of a Generic and Reconfigurable Near-Data Processing Framework
Article No.: 96, Pages 1–23, https://doi.org/10.1145/3563456

We present the design and implementation of TH-iSSD, a near-data processing framework to address the data movement problem. TH-iSSD does not pose any restriction to the hardware selection and is highly reconfigurable—its core components, such as the on-...

SECTION: Regular Papers
research-article
RegKey: A Register-based Implementation of ECC Signature Algorithms Against One-shot Memory Disclosure
Article No.: 97, Pages 1–22, https://doi.org/10.1145/3604805

To ensure the security of cryptographic algorithm implementations, several cryptographic key protection schemes have been proposed to prevent various memory disclosure attacks. Among them, the register-based solutions do not rely on special hardware ...

research-article
SensiX++: Bringing MLOps and Multi-tenant Model Serving to Sensory Edge Devices
Article No.: 98, Pages 1–27, https://doi.org/10.1145/3617507

We present SensiX++, a multi-tenant runtime for adaptive model execution with integrated MLOps on edge devices, e.g., a camera, a microphone, or IoT sensors. SensiX++ operates on two fundamental principles: highly modular componentisation to externalise ...

research-article
Open Access
Scheduling Dynamic Software Updates in Mobile Robots
Article No.: 99, Pages 1–27, https://doi.org/10.1145/3623676

We present NeRTA (Next Release Time Analysis), a technique to enable dynamic software updates for low-level control software of mobile robots. Dynamic software updates enable software correction and evolution during system operation. In mobile robotics, ...

research-article
Open Access
Online Distributed Schedule Randomization to Mitigate Timing Attacks in Industrial Control Systems
Article No.: 100, Pages 1–39, https://doi.org/10.1145/3624584

Industrial control systems (ICSs) consist of a large number of control applications that are associated with periodic real-time flows with hard deadlines. To facilitate large-scale integration, remote control, and co-ordination, wireless sensor and ...

research-article
SG-Float: Achieving Memory Access and Computing Power Reduction Using Self-Gating Float in CNNs
Article No.: 101, Pages 1–22, https://doi.org/10.1145/3624582

Convolutional neural networks (CNNs) are essential for advancing the field of artificial intelligence. However, since these networks are highly demanding in terms of memory and computation, implementing CNNs can be challenging. To make CNNs more ...

research-article
Energy-Efficient Communications for Improving Timely Progress of Intermittent-Powered BLE Devices
Article No.: 102, Pages 1–20, https://doi.org/10.1145/3626197

Battery-less devices offer potential solutions for maintaining sustainable Internet of Things (IoT) networks. However, limited energy harvesting capacity can lead to power failures, limiting the system’s quality of service (QoS). To improve timely task ...

research-article
A Comprehensive Model for Efficient Design Space Exploration of Imprecise Computational Blocks
Article No.: 103, Pages 1–20, https://doi.org/10.1145/3625555

After almost a decade of research, development of more efficient imprecise computational blocks is still a major concern in imprecise computing domain. There are many instances of the introduced imprecise components of different types, while their main ...

research-article
Dynamic Thermal Management of 3D Memory through Rotating Low Power States and Partial Channel Closure
Article No.: 104, Pages 1–27, https://doi.org/10.1145/3624581

Modern high-performance and high-bandwidth three-dimensional (3D) memories are characterized by frequent heating. Prior art suggests turning off hot channels and migrating data to the background DDR memory, incurring significant performance and energy ...

research-article
Open Access
Enabling Binary Neural Network Training on the Edge
Article No.: 105, Pages 1–19, https://doi.org/10.1145/3626100

The ever-growing computational demands of increasingly complex machine learning models frequently necessitate the use of powerful cloud-based infrastructure for their training. Binary neural networks are known to be promising candidates for on-device ...

research-article
Design and Analysis of High Performance Heterogeneous Block-based Approximate Adders
Article No.: 106, Pages 1–32, https://doi.org/10.1145/3625686

Approximate computing is an emerging paradigm to improve the power and performance efficiency of error-resilient applications. As adders are one of the key components in almost all processing systems, a significant amount of research has been carried out ...
