- research-article, November 2024, JUST ACCEPTED
SILVIA: Automated Superword-Level Parallelism Exploitation via HLS-Specific LLVM Passes for Compute-Intensive FPGA Accelerators
ACM Transactions on Reconfigurable Technology and Systems (TRETS), Just Accepted, https://doi.org/10.1145/3705324
High-level synthesis (HLS) aims at democratizing custom hardware acceleration with highly abstracted software-like descriptions. However, efficient accelerators still require substantial low-level hardware optimizations, defeating the HLS intent. In the ...
- research-article, November 2024
FPGA Accelerated Implementation of 3D Mesh Secret Sharing Based on Symmetric Similarity of Model
ACM Transactions on Reconfigurable Technology and Systems (TRETS), Volume 17, Issue 4, Article No.: 60, Pages 1–19, https://doi.org/10.1145/3689049
Secret sharing is particularly important in the field of information security, which allows for the reconstruction of secret information from secure shares. However, due to the large amount of data and non-integer data type of 3D (three-dimensional) ...
- research-article, November 2024
EXPRESS: A Framework for Execution Time Prediction of Concurrent CNNs on Xilinx DPU Accelerator
ACM Transactions on Embedded Computing Systems (TECS), Volume 24, Issue 1, Article No.: 11, Pages 1–31, https://doi.org/10.1145/3697835
Deep learning Processor Unit (DPU) is a highly configurable CNN accelerator that supports a variety of CNNs and can be implemented with multiple instances on the same FPGA. Many applications deploy concurrent execution of different CNNs and in such a ...
- research-article, November 2024
Scabbard: An Exploratory Study on Hardware Aware Design Choices of Learning with Rounding-based Key Encapsulation Mechanisms
- Suparna Kundu,
- Quinten Norga,
- Angshuman Karmakar,
- Shreya Gangopadhyay,
- Jose Maria Bermudo Mera,
- Ingrid Verbauwhede
ACM Transactions on Embedded Computing Systems (TECS), Volume 24, Issue 1, Article No.: 10, Pages 1–40, https://doi.org/10.1145/3696208
Recently, the construction of cryptographic schemes based on hard lattice problems has gained immense popularity. Apart from being quantum resistant, lattice-based cryptography allows a wide range of variations in the underlying hard problem. As ...
- research-article, November 2024
Codesign of Reactor-Oriented Hardware and Software for Cyber-Physical Systems
ACM Transactions on Reconfigurable Technology and Systems (TRETS), Volume 17, Issue 4, Article No.: 55, Pages 1–30, https://doi.org/10.1145/3672083
Modern cyber-physical systems often make use of heterogeneous systems-on-chip with reconfigurable logic to provide adequate computing power and flexible I/O. However, modeling, verifying, and implementing the computations spanning CPUs and reconfigurable ...
- research-article, November 2024
Short Paper: Analysis of Vivado implementation strategies regarding side-channel leakage for FPGA-based AES implementations
HASP '24: Proceedings of the International Workshop on Hardware and Architectural Support for Security and Privacy 2024, Pages 45–49, https://doi.org/10.1145/3696843.3696853
Dynamic restructuring of cryptographic implementations has been proposed as a viable countermeasure against power and electro-magnetic-based Side Channel Attacks (SCA). These kind of countermeasures involve shuffling between functionally identical but ...
- research-article, November 2024
Efficient deployment of Single Shot Multibox Detector network on FPGAs
Abstract: FPGAs, characterized by their low power consumption and swift response, are ideally suited for parallel computations associated with object detection tasks, making them a popular choice for target detection and neural network acceleration. ...
Highlights:
- Parallel computation boosts speed and efficiency in convolutional layers.
- Integrated parallel processing enhances convolution activation pooling.
- Efficient memory management reduces read/write time for feature layers.
- ...
- research-article, November 2024
Design and implementation of deep learning-based object detection and tracking system
Abstract: Many human tracking methods by deep learning rely on powerful computing resources. For embedded platforms with limited resources, efficient use of resources is a priority. In this paper, we design an object detection and tracking system based on ...
Highlights:
- Design an efficient human tracking system using a single-object tracker with feature-based and detection-based CNN method.
- Deploy YOLO v3 on a dual-core DPU partitioned with a reliable design methodology and rapid design framework.
- review-article, October 2024
Neural Networks Implementations on FPGA for Biomedical Applications: A Review
Abstract: The use of artificial intelligence in healthcare applications offers significant accuracy and utility for medical practitioners and patients. Deep learning has made a substantial positive impact on the healthcare industry by reducing the use of ...
- research-article, October 2024
SAPFIS: a parallel fuzzy inference system for air combat situation assessment
Abstract: Situation assessment is an important basis for achieving autonomous decision-making in air combat. The ever-increasing multi-source fusion information perceived by situation assessment system poses a computational challenge to current airborne ...
- abstract, October 2024
Towards Energy-Efficient Llama2 Architecture on Embedded FPGAs
CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, Pages 5570–5571, https://doi.org/10.1145/3627673.3679068
Large language models (LLMs) have shown immense potential for applications in information retrieval and knowledge management, but their computational and memory demands pose challenges for resource-constrained devices. In response, this work introduces ...
- review-article, October 2024
Automated parallel execution of distributed task graphs with FPGA clusters
- Juan Miguel de Haro Ruiz,
- Carlos Álvarez Martínez,
- Daniel Jiménez-González,
- Xavier Martorell,
- Tomohiro Ueno,
- Kentaro Sano,
- Burkhard Ringlein,
- François Abel,
- Beat Weiss
Future Generation Computer Systems (FGCS), Volume 160, Issue C, Pages 808–824, https://doi.org/10.1016/j.future.2024.06.041
Abstract: Over the years, Field Programmable Gate Arrays (FPGA) have been gaining popularity in the High Performance Computing (HPC) field, because their reconfigurability enables very fine-grained optimizations with low energy cost. However, the different ...
Highlights:
- High-level task-based programming model for FPGA clusters with MPI-like communication.
- High performance computing applications can be easily adapted to FPGA clusters.
- Automatic MPI communication inferred by the runtime, users do ...
- research-article, October 2024
A lightweight distillation recurrent convolution network on FPGA for real-time video super-resolution
Abstract: In the application of image super-resolution (SR) based on field-programmable gate array (FPGA), depthwise separable convolution is widely utilized. However, existing network designs overly simplify the structures used for deep feature extraction ...
- research-article, October 2024
A Parallel Hash Table for Streaming Applications
PACT '24: Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, Pages 297–308, https://doi.org/10.1145/3656019.3676951
Hash Tables are important data structures for a wide range of data intensive applications in various domains. They offer compact storage for sparse data, but their performance has difficulties to scale with the rapidly increasing volumes of data as they ...
- research-article, October 2024
Design and Implementation of Hardware-Software Architecture Based on Hashes for SPHINCS+
ACM Transactions on Reconfigurable Technology and Systems (TRETS), Volume 17, Issue 4, Article No.: 54, Pages 1–22, https://doi.org/10.1145/3653459
Advances in quantum computing have posed a future threat to today’s cryptography. With the advent of these quantum computers, security could be compromised. Therefore, the National Institute of Standards and Technology (NIST) has issued a request for ...
- survey, October 2024
A Review of Techniques for Ageing Detection and Monitoring on Embedded Systems
ACM Computing Surveys (CSUR), Volume 57, Issue 1, Article No.: 24, Pages 1–34, https://doi.org/10.1145/3695247
Embedded digital devices are progressively deployed in dependable or safety-critical systems. These devices undergo significant hardware ageing, particularly in harsh environments. This increases their likelihood of failure. It is crucial to understand ...
- research-article, September 2024
Hardware Acceleration for High-Volume Operations of CRYSTALS-Kyber and CRYSTALS-Dilithium
- research-article, September 2024
- research-article, September 2024
A Computation of the Ninth Dedekind Number Using FPGA Supercomputing
- research-article, September 2024
A digital hardware system for real-time biorealistic stimulation on in vitro cardiomyocytes
Artificial Life and Robotics (SPALR), Volume 29, Issue 4, Pages 473–478, https://doi.org/10.1007/s10015-024-00968-1
Abstract: Every year, cardiovascular diseases cause millions of deaths worldwide. These diseases involve complex mechanisms that are difficult to study. To remedy this problem, we propose to develop a heart–brain platform capable of reproducing the ...