Acceleration of Multi-Body Molecular Dynamics With Customized Parallel Dataflow
- Quan Deng,
- Qiang Liu,
- Ming Yuan,
- Xiaohui Duan,
- Lin Gan,
- Jinzhe Yang,
- Wenlai Zhao,
- Zhenxiang Zhang,
- Guiming Wu,
- Wayne Luk,
- Haohuan Fu,
- Guangwen Yang
FPGAs are drawing increasing attention for solving molecular dynamics (MD) problems and have already been applied to problems such as two-body potentials and force fields composed of these potentials. Competitive performance is obtained compared with ...
Optimizing I/O Performance Through Effective vCPU Scheduling Interference Management
Virtual machines (VMs) rely heavily on virtual CPU (vCPU) scheduling to achieve efficient I/O performance. vCPU scheduling interference can cause inconsistent scheduling latency and degraded I/O performance, potentially compromising ...
Two-Timescale Joint Optimization of Task Scheduling and Resource Scaling in Multi-Data Center System Based on Multi-Agent Deep Reinforcement Learning
As a new computing paradigm, multi-data center computing enables service providers to deploy their applications close to the users. However, due to the spatio-temporal changes in workloads, it is challenging to coordinate multiple distributed data centers ...
Fair Coflow Scheduling via Controlled Slowdown
- Francesco De Pellegrini,
- Vaibhav Kumar Gupta,
- Rachid El Azouzi,
- Serigne Gueye,
- Cedric Richier,
- Jeremie Leguay
The average coflow completion time (CCT) is the standard performance metric in coflow scheduling. However, standard CCT minimization may introduce unfairness between the data transfer phases of different computing jobs. Thus, while progress guarantees have ...
GeoDeploy: Geo-Distributed Application Deployment Using Benchmarking
- Devki Nandan Jha,
- Yinhao Li,
- Zhenyu Wen,
- Graham Morgan,
- Prem Prakash Jayaraman,
- Maciej Koutny,
- Omer F. Rana,
- Rajiv Ranjan
Geo-distributed web applications (GWAs) can be deployed across multiple geographically separated datacenters to reduce access latency for users. Finding a suitable deployment for a GWA is challenging due to the requirement to consider a number of ...
Efficient Schedule Construction for Distributed Execution of Large DNN Models
Increasingly complex and diverse deep neural network (DNN) models necessitate distributing the execution across multiple devices for training and inference tasks, and also require carefully planned schedules for performance. However, existing practices ...
Distributed Task Processing Platform for Infrastructure-Less IoT Networks: A Multi-Dimensional Optimization Approach
With the rapid development of artificial intelligence (AI) and the Internet of Things (IoT), intelligent information services have showcased unprecedented capabilities in acquiring and analysing information. The conventional task processing platforms rely ...
VisionAGILE: A Versatile Domain-Specific Accelerator for Computer Vision Tasks
The emergence of diverse machine learning (ML) models has led to groundbreaking revolutions in computer vision (CV). These ML models include convolutional neural networks (CNNs), graph neural networks (GNNs), and vision transformers (ViTs). However, ...