- Article, September 2024
Detecting as Labeling: Rethinking LiDAR-Camera Fusion in 3D Object Detection
Abstract: 3D object Detection with LiDAR-camera encounters overfitting in algorithm development derived from violating some fundamental rules. We refer to the data annotation in dataset construction for theoretical optimization and argue that the regression ...
- research-article, March 2024
XVDPU: A High-Performance CNN Accelerator on the Versal Platform Powered by the AI Engine
- Xijie Jia,
- Yu Zhang,
- Guangdong Liu,
- Xinlin Yang,
- Tianyu Zhang,
- Jia Zheng,
- Dongdong Xu,
- Zhuohuan Liu,
- Mengke Liu,
- Xiaoyang Yan,
- Hong Wang,
- Rongzhang Zheng,
- Li Wang,
- Dong Li,
- Satyaprakash Pareek,
- Jian Weng,
- Lu Tian,
- Dongliang Xie,
- Hong Luo,
- Yi Shan
ACM Transactions on Reconfigurable Technology and Systems (TRETS), Volume 17, Issue 2, Article No.: 20, Pages 1–24
https://doi.org/10.1145/3617836
Today, convolutional neural networks (CNNs) are widely used in computer vision applications. However, the trends of higher accuracy and higher resolution generate larger networks. The requirements of computation or I/O are the key bottlenecks. In this ...
- research-article, September 2023
Solver-In-The-Loop Cluster Resource Management for Database-as-a-Service
Proceedings of the VLDB Endowment (PVLDB), Volume 16, Issue 13, Pages 4254–4267
https://doi.org/10.14778/3625054.3625062
In Database-as-a-Service (DBaaS) clusters, resource management is a complex optimization problem that assigns tenants to nodes, subject to various constraints and objectives. Tenants share resources within a node, however, their resource demands can ...
- research-article, September 2023
Flexible Resource Allocation for Relational Database-as-a-Service
- Pankaj Arora,
- Surajit Chaudhuri,
- Sudipto Das,
- Junfeng Dong,
- Cyril George,
- Ajay Kalhan,
- Arnd Christian König,
- Willis Lang,
- Changsong Li,
- Feng Li,
- Jiaqi Liu,
- Lukas M. Maas,
- Akshay Mata,
- Ishai Menache,
- Justin Moeller,
- Vivek Narasayya,
- Matthaios Olma,
- Morgan Oslake,
- Elnaz Rezai,
- Yi Shan,
- Manoj Syamala,
- Shize Xu,
- Vasileios Zois
Proceedings of the VLDB Endowment (PVLDB), Volume 16, Issue 13, Pages 4202–4215
https://doi.org/10.14778/3625054.3625058
Oversubscription is an essential cost management strategy for cloud database providers, and its importance is magnified by the emerging paradigm of serverless databases. In contrast to general purpose techniques used for oversubscription in hypervisors, ...
- research-article, July 2022
Tenant placement in over-subscribed database-as-a-service clusters
- Arnd Christian König,
- Yi Shan,
- Tobias Ziegler,
- Aarati Kakaraparthy,
- Willis Lang,
- Justin Moeller,
- Ajay Kalhan,
- Vivek Narasayya
Proceedings of the VLDB Endowment (PVLDB), Volume 15, Issue 11, Pages 2559–2571
https://doi.org/10.14778/3551793.3551814
Relational cloud Database-as-a-Service offerings run on multi-tenant infrastructure consisting of clusters of nodes, with each node hosting multiple tenant databases. Such clusters may be over-subscribed to increase resource utilization and improve ...
- Article, November 2020
MTNAS: Search Multi-task Networks for Autonomous Driving
Abstract: Multi-task learning (MTL) aims to learn shared representations from multiple tasks simultaneously, which has yielded outstanding performance in widespread applications of computer vision. However, existing multi-task approaches often demand manual ...
- Article, August 2020
ProgressFace: Scale-Aware Progressive Learning for Face Detection
Abstract: Scale variation stands out as one of key challenges in face detection. Recent attempts have been made to cope with this issue by incorporating image/feature pyramids or adjusting anchor sampling/matching strategies. In this work, we propose a ...
- poster, February 2020
LPAC: A Low-Precision Accelerator for CNN on FPGAs
- Tianyu Zhang,
- Tiantian Han,
- Lu Tian,
- Yi Li,
- Xijie Jia,
- Guangdong Liu,
- Pingbo An,
- Yingran Tan,
- Lingzhi Sui,
- Shaoxie Fang,
- Dongliang Xie,
- Michaela Blott,
- Yi Shan
FPGA '20: Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Page 316
https://doi.org/10.1145/3373087.3375343
Low bit quantization of neural network is required on edge devices to achieve lower power consumption and higher performance. 8bit or binary network either consumes a lot of resources or has accuracy degradation. Thus, a full-process hardware-friendly ...
- research-article, January 2020
Black Box Search Space Profiling for Accelerator-Aware Neural Architecture Search
ASPDAC '20: Proceedings of the 25th Asia and South Pacific Design Automation Conference, Pages 518–523
https://doi.org/10.1109/ASP-DAC47756.2020.9045179
Neural Architecture Search (NAS) is a promising approach to discover good neural network architectures for given applications. Among the three basic components in a NAS system (search space, search strategy, and evaluation), prior work mainly focused on ...
- poster, February 2019
DNNVM: End-to-End Compiler Leveraging Operation Fusion on FPGA-based CNN Accelerators
- Yu Xing,
- Shuang Liang,
- Lingzhi Sui,
- Zhen Zhang,
- Jiantao Qiu,
- Xijie Jia,
- Xin Liu,
- Yushun Wang,
- Yi Shan,
- Yu Wang
FPGA '19: Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Pages 187–188
https://doi.org/10.1145/3289602.3293972
In recent years, Convolutional Neural Network (CNN) is becoming the state-of-the-art method in a wide range of Artificial Intelligence (AI) domains. The increasingly large and complex CNN models are both computation bound and I/O bound. FPGA-based ...
- poster, February 2019
A Fine-Grained Sparse Accelerator for Multi-Precision DNN
- Shulin Zeng,
- Yujun Lin,
- Shuang Liang,
- Junlong Kang,
- Dongliang Xie,
- Yi Shan,
- Song Han,
- Yu Wang,
- Huazhong Yang
FPGA '19: Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Page 185
https://doi.org/10.1145/3289602.3293964
Neural Networks (NNs) have made a significant breakthrough in many fields, while they also pose a great challenge to hardware platforms since the state-of-the-art neural networks are both communicational- and computational-intensive. Researchers ...
- research-article, March 2017
Fast HEVC intra coding algorithm based on machine learning and Laplacian Transparent Composite Model
2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Pages 2642–2646
https://doi.org/10.1109/ICASSP.2017.7952635
Compared with H.264, High Efficient Video Coding (HEVC) improves the coding efficiency by 50% at the price of significant increase in encoding time, due to Rate Distortion Optimization (RDO) on large variations of block sizes and prediction modes. In this ...
- research-article, December 2015
RRAM-Based Analog Approximate Computing
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCADICS), Volume 34, Issue 12, Pages 1905–1917
https://doi.org/10.1109/TCAD.2015.2445741
Approximate computing is a promising design paradigm for better performance and power efficiency. In this paper, we propose a power efficient framework for analog approximate computing with the emerging metal-oxide resistive switching random-access memory ...
- Article, June 2015
Scalable Query Optimization for Efficient Data Processing Using MapReduce
BIGDATACONGRESS '15: Proceedings of the 2015 IEEE International Congress on Big Data, Pages 649–652
https://doi.org/10.1109/BigDataCongress.2015.100
MapReduce is widely acknowledged by both industry and academia as an effective programming model for query processing on big data. It is crucial to design an optimizer which finds the most efficient way to execute an SQL query using MapReduce. However, ...
- research-article, May 2014
Enabling FPGAs in the cloud
CF '14: Proceedings of the 11th ACM Conference on Computing Frontiers, Article No.: 3, Pages 1–10
https://doi.org/10.1145/2597917.2597929
Cloud computing is becoming a major trend for delivering and accessing infrastructure on demand via the network. Meanwhile, the usage of FPGAs (Field Programmable Gate Arrays) for computation acceleration has made significant inroads into multiple ...
- research-article, April 2014
Hardware Acceleration for an Accurate Stereo Vision System Using Mini-Census Adaptive Support Region
ACM Transactions on Embedded Computing Systems (TECS), Volume 13, Issue 4s, Article No.: 132, Pages 1–24
https://doi.org/10.1145/2584659
Domain of stereo vision is highly important in the fields of autonomous cars, video tolling, robotics, and aerial surveys. The specific feature of this domain is that we should handle not only the pixel-by-pixel 2D processing in one image but also the ...
- research-article, September 2013
Memristor-based approximated computation
ISLPED '13: Proceedings of the 2013 International Symposium on Low Power Electronics and Design, Pages 242–247
The cessation of Moore's Law has limited further improvements in power efficiency. In recent years, the physical realization of the memristor has demonstrated a promising solution to ultra-integrated hardware realization of neural networks, which can be ...
- Article, March 2011
FPGA accelerated parallel sparse matrix factorization for circuit simulations
Sparse matrix factorization is a critical step for the circuit simulation problem, since it is time consuming and computed repeatedly in the flow of circuit simulation. To accelerate the factorization of sparse matrices, a parallel CPU+FPGA based ...
- Article, December 2010
Making Human Connectome Faster: GPU Acceleration of Brain Network Analysis
ICPADS '10: Proceedings of the 2010 IEEE 16th International Conference on Parallel and Distributed Systems, Pages 593–600
https://doi.org/10.1109/ICPADS.2010.105
The research on complex Brain Networks plays a vital role in understanding the connectivity patterns of the human brain and disease-related alterations. Recent studies have suggested a noninvasive way to model and analyze human brain networks by using ...
- Article, October 2010
Popular or Personal: Access Patterns of User Generated Content
ICNDC '10: Proceedings of the 2010 First International Conference on Networking and Distributed Computing, Pages 367–371
https://doi.org/10.1109/ICNDC.2010.76
It is generally believed that Zipf’s law is well applied for Web objects. However, as OSNs prevail and Internet begins to concentrate more and more on individuals, many researchers report that request frequency of Web objects deviates from Zipf’s law ...