research-article
HiHGNN: Accelerating HGNNs Through Parallelism and Data Reusability Exploitation

Heterogeneous graph neural networks (HGNNs) have emerged as powerful algorithms for processing heterogeneous graphs (HetGs), widely used in many critical fields. To capture both structural and semantic information in HetGs, HGNNs first aggregate the ...

research-article
A 3D Hybrid Optical-Electrical NoC Using Novel Mapping Strategy Based DCNN Dataflow Acceleration

The large number of multiply-accumulate operations and memory accesses required by deep convolutional neural networks (DCNNs) leads to high latency and energy consumption (EC), which hinder their further applications. Dataflow-based acceleration schemes ...

research-article
Synchronize Only the Immature Parameters: Communication-Efficient Federated Learning By Freezing Parameters Adaptively

Federated learning allows edge devices to collaboratively train a global model without sharing their local private data. Yet, with limited network bandwidth at the edge, communication often becomes a severe bottleneck. In this article, we find that it is ...

research-article
FastTuning: Enabling Fast and Efficient Hyper-Parameter Tuning With Partitioning and Parallelism of Search Space

Hyper-parameter tuning (HPT) for deep learning (DL) models is prohibitively expensive. Sequential model-based optimization (SMBO) emerges as the state-of-the-art (SOTA) approach to automatically optimize HPT performance due to its heuristic advantages. ...

research-article
FedREM: Guided Federated Learning in the Presence of Dynamic Device Unpredictability

Federated learning (FL) is a promising distributed machine learning scheme in which multiple clients collaborate by sharing a common learning model while keeping their private data local. It can be applied to many applications, e.g., training an ...

research-article
Fed-RAC: Resource-Aware Clustering for Tackling Heterogeneity of Participants in Federated Learning

Federated Learning is a training framework that enables multiple participants to collaboratively train a shared model while preserving data privacy. The heterogeneity of participants' devices and networking resources delays the training and ...

research-article
Graph-Centric Performance Analysis for Large-Scale Parallel Applications

Performance analysis is essential for understanding the performance behaviors of parallel programs and detecting performance bottlenecks. However, complex interconnections across several types of performance bugs, as well as inter-process communications ...

research-article
Spiking Neural P Systems With Microglia

Spiking neural P systems (SNP systems), one of the parallel and distributed computing models with biological interpretability, have been a hot research topic in bio-inspired computational models in recent years. To improve the stability of the models, ...

research-article
Bayesian-Driven Automated Scaling in Stream Computing With Multiple QoS Targets

Stream processing systems commonly work with auto-scaling to ensure resource efficiency and quality of service (QoS). Existing auto-scaling solutions lack accuracy in resource allocation because they rely on static QoS-resource models that fail to account ...

research-article
Availability-Aware Revenue-Effective Application Deployment in Multi-Access Edge Computing

Multi-access edge computing (MEC) has emerged as a promising computing paradigm to push computing resources and services to the network edge. It allows applications/services to be deployed on edge servers for provisioning low-latency services to nearby ...

research-article
AdaptChain: Adaptive Data Sharing and Synchronization for NFV Systems on Heterogeneous Architectures

In a Network Function Virtualization (NFV) system, network functions (NFs) are implemented on general-purpose hardware, including CPU, GPU, and FPGA. Studies have shown that there is no one-size-fits-all processor, as each processor demonstrates ...

research-article
CREPE: Concurrent Reverse-Modulo-Scheduling and Placement for CGRAs

Coarse-Grained Reconfigurable Array (CGRA) architectures are popular as high-performance and energy-efficient computing devices. Compute-intensive loop constructs of complex applications are mapped onto CGRAs by modulo-scheduling the innermost loop ...

research-article
Open Access
Rollback-Free Recovery for a High Performance Dense Linear Solver With Reduced Memory Footprint

The scale of today's High Performance Computing (HPC) systems is the key element behind their impressive performance, as well as the reason for their relatively limited reliability. Over the last decade, specific areas of the High ...

research-article
Adaptive Neural Control for a Network of Parabolic PDEs With Event-Triggered Mechanism

This paper investigates the finite-time consensus problem for nonlinear parabolic networks by designing a new tracking controller. For undirected topologies, the newly designed controller allows the consensus time to be optimized by adjusting the parameter ...
