ACM Transactions on Architecture and Code Optimization, Volume 21
Volume 21, Number 1, March 2024
- Longfei Luo, Dingcui Yu, Yina Lv, Liang Shi:
Critical Data Backup with Hybrid Flash-Based Consumer Devices. 1:1-1:23
- Peng Chen, Hui Chen, Weichen Liu, Linbo Long, Wanli Chang, Nan Guan:
DAG-Order: An Order-Based Dynamic DAG Scheduling for Real-Time Networks-on-Chip. 2:1-2:24
- Zhang Jiang, Ying Chen, Xiaoli Gong, Jin Zhang, Wenwen Wang, Pen-Chung Yew:
JiuJITsu: Removing Gadgets with Safe Register Allocation for JIT Code Generation. 3:1-3:26
- Hayfa Tayeb, Ludovic Paillat, Bérenger Bramas:
Autovesk: Automatic Vectorized Code Generation from Unstructured Static Kernels Using Graph Transformations. 4:1-4:25
- Xueying Wang, Guangli Li, Zhen Jia, Xiaobing Feng, Yida Wang:
Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs. 5:1-5:26
- Hao Fan, Yiliang Ye, Shadi Ibrahim, Zhuo Huang, Xingru Li, Weibin Xue, Song Wu, Chen Yu, Xuanhua Shi, Hai Jin:
QoS-pro: A QoS-enhanced Transaction Processing Framework for Shared SSDs. 6:1-6:25
- Yunping Zhao, Sheng Ma, Hengzhu Liu, Libo Huang, Yi Dai:
SAC: An Ultra-Efficient Spin-based Architecture for Compressed DNNs. 7:1-7:26
- Tong-Yu Liu, Jianmei Guo, Bo Huang:
Efficient Cross-platform Multiplexing of Hardware Performance Counters via Adaptive Grouping. 8:1-8:26
- Lei Liu, Xinglei Dou:
QuCloud+: A Holistic Qubit Mapping Scheme for Single/Multi-programming on 2D/3D NISQ Quantum Computers. 9:1-9:27
- Lingxi Wu, Minxuan Zhou, Weihong Xu, Ashish Venkat, Tajana Rosing, Kevin Skadron:
Abakus: Accelerating k-mer Counting with Storage Technology. 10:1-10:26
- Seokwon Kang, Jongbin Kim, Gyeongyong Lee, Jeongmyung Lee, Jiwon Seo, Hyungsoo Jung, Yong Ho Song, Yongjun Park:
ISP Agent: A Generalized In-storage-processing Workload Offloading Framework by Providing Multiple Optimization Opportunities. 11:1-11:24
- Prasoon Mishra, V. Krishna Nandivada:
COWS for High Performance: Cost Aware Work Stealing for Irregular Parallel Loop. 12:1-12:26
- Joongun Park, Seunghyo Kang, Sanghyeon Lee, Taehoon Kim, Jongse Park, Youngjin Kwon, Jaehyuk Huh:
Hardware-hardened Sandbox Enclaves for Trusted Serverless Computing. 13:1-13:25
- Tyler N. Allen, Bennett Cooper, Rong Ge:
Fine-grain Quantitative Analysis of Demand Paging in Unified Virtual Memory. 14:1-14:24
- Zhonghua Wang, Yixing Guo, Kai Lu, Jiguang Wan, Daohui Wang, Ting Yao, Huatao Wu:
Rcmp: Reconstructing RDMA-Based Memory Disaggregation via CXL. 15:1-15:26
- Linbo Long, Shuiyong He, Jingcheng Shen, Renping Liu, Zhenhua Tan, Congming Gao, Duo Liu, Kan Zhong, Yi Jiang:
WA-Zone: Wear-Aware Zone Management Optimization for LSM-Tree on ZNS SSDs. 16:1-16:23
- Zhihua Fan, Wenming Li, Zhen Wang, Yu Yang, Xiaochun Ye, Dongrui Fan, Ninghui Sun, Xuejun An:
Improving Utilization of Dataflow Unit for Multi-Batch Processing. 17:1-17:26
- Dunbo Zhang, Qingjie Lang, Ruoxi Wang, Li Shen:
Extension VM: Interleaved Data Layout in Vector Memory. 18:1-18:23
- Can Firtina, Kamlesh R. Pillai, Gurpreet S. Kalsi, Bharathwaj Suresh, Damla Senol Cali, Jeremie S. Kim, Taha Shahroodi, Meryem Banu Cavlak, Joël Lindegger, Mohammed Alser, Juan Gómez-Luna, Sreenivas Subramoney, Onur Mutlu:
ApHMM: Accelerating Profile Hidden Markov Models for Fast and Energy-efficient Genome Analysis. 19:1-19:29
- Khalid Ahmad, Cris Cecka, Michael Garland, Mary W. Hall:
Exploring Data Layout for Sparse Tensor Times Dense Matrix on GPUs. 20:1-20:20
Volume 21, Number 2, June 2024
- Chandra Sekhar Mummidi, Victor da Cruz Ferreira, Sudarshan Srinivasan, Sandip Kundu:
Highly Efficient Self-checking Matrix Multiplication on Tiled AMX Accelerators. 21
- Zhonghua Wang, Chen Ding, Fengguang Song, Kai Lu, Jiguang Wan, Zhihu Tan, Changsheng Xie, Guokuan Li:
WIPE: A Write-Optimized Learned Index for Persistent Memory. 22
- Gino A. Chacon, Charles Williams, Johann Knechtel, Ozgur Sinanoglu, Paul V. Gratz, Vassos Soteriou:
Coherence Attacks and Countermeasures in Interposer-based Chiplet Systems. 23
- Yan Wei, Xingjun Zhang:
A Concise Concurrent B+-Tree for Persistent Memory. 24
- Fareed Qararyah, Muhammad Waqar Azhar, Pedro Trancoso:
An Efficient Hybrid Deep Learning Accelerator for Compact and Heterogeneous CNNs. 25
- Fernando Fernandes dos Santos, Luigi Carro, Flavio Vella, Paolo Rech:
Assessing the Impact of Compiler Optimizations on GPUs Reliability. 26
- Valentin Isaac-Chassande, Adrian Evans, Yves Durand, Frédéric Rousseau:
Dedicated Hardware Accelerators for Processing of Sparse Matrices and Vectors: A Survey. 27
- Benyi Xie, Yue Yan, Chenghao Yan, Sicheng Tao, Zhuangzhuang Zhang, Xinyu Li, Yanzhi Lan, Xiang Wu, Tianyi Liu, Tingting Zhang, Fuxin Zhang:
An Instruction Inflation Analyzing Framework for Dynamic Binary Translators. 28
- Samuel Rac, Mats Brorsson:
Cost-aware Service Placement and Scheduling in the Edge-Cloud Continuum. 29
- Feng Xue, Chenji Han, Xinyu Li, Junliang Wu, Tingting Zhang, Tianyi Liu, Yifan Hao, Zidong Du, Qi Guo, Fuxin Zhang:
Tyche: An Efficient and General Prefetcher for Indirect Memory Accesses. 30
- Kunpeng Xie, Ye Lu, Xinyu He, Dezhi Yi, Huijuan Dong, Yao Chen:
Winols: A Large-Tiling Sparse Winograd CNN Accelerator on FPGAs. 31
- Ke Liu, Kan Wu, Hua Wang, Ke Zhou, Peng Wang, Ji Zhang, Cong Li:
SLAP: Segmented Reuse-Time-Label Based Admission Policy for Content Delivery Network Caching. 32
- Panagiotis Miliadis, Dimitris Theodoropoulos, Dionisios N. Pnevmatikatos, Nectarios Koziris:
Architectural Support for Sharing, Isolating and Virtualizing FPGA Resources. 33
- Haitao Du, Yuhan Qin, Song Chen, Yi Kang:
FASA-DRAM: Reducing DRAM Latency with Destructive Activation and Delayed Restoration. 34
- Michael Canesche, Vanderson Martins do Rosário, Edson Borin, Fernando Magno Quintão Pereira:
The Droplet Search Algorithm for Kernel Scheduling. 35
- Asmita Pal, Keerthana Desai, Rahul Chatterjee, Joshua San Miguel:
Camouflage: Utility-Aware Obfuscation for Accurate Simulation of Sensitive Program Traces. 36
- Chengying Huan, Yongchao Liu, Heng Zhang, Shuaiwen Song, Santosh Pandey, Shiyang Chen, Xiangfei Fang, Yue Jin, Baptiste Lepers, Yanjun Wu, Hang Liu:
TEA+: A Novel Temporal Graph Random Walk Engine with Hybrid Storage Architecture. 37
- Soojin Hwang, Daehyeon Baek, Jongse Park, Jaehyuk Huh:
Cerberus: Triple Mode Acceleration of Sparse Matrix and Vector Multiplication. 38
- Siddhartha Raman Sundara Raman, Lizy Kurian John, Jaydeep P. Kulkarni:
NEM-GNN: DAC/ADC-less, Scalable, Reconfigurable, Graph and Sparsity-Aware Near-Memory Accelerator for Graph Neural Networks. 39
- Yan Chen, Qiwen Ke, Huiba Li, Yongwei Wu, Yiming Zhang:
xMeta: SSD-HDD-hybrid Optimization for Metadata Maintenance of Cloud-scale Object Storage. 40
- Vidush Singhal, Laith Sakka, Kirshanthan Sundararajah, Ryan Newton, Milind Kulkarni:
Orchard: Heterogeneous Parallelism and Fine-grained Fusion for Complex Tree Traversals. 41
Volume 21, Number 3, September 2024
- Hajar Falahati, Mohammad Sadrosadati, Qiumin Xu, Juan Gómez-Luna, Banafsheh Saber Latibari, Hyeran Jeon, Shaahin Hessabi, Hamid Sarbazi-Azad, Onur Mutlu, Murali Annavaram, Masoud Pedram:
Cross-core Data Sharing for Energy-efficient GPUs. 42:1-42:32
- Ching-Jui Lee, Tsung Tai Yeh:
ReSA: Reconfigurable Systolic Array for Multiple Tiny DNN Tensors. 43:1-43:24
- Ziheng Wang, Xiaoshe Dong, Yan Kang, Heng Chen, Qiang Wang:
An Example of Parallel Merkle Tree Traversal: Post-Quantum Leighton-Micali Signature on the GPU. 44:1-44:25
- Jiang Wu, Zhuo Zhang, Deheng Yang, Jianjun Xu, Jiayu He, Xiaoguang Mao:
Knowledge-Augmented Mutation-Based Bug Localization for Hardware Design Code. 45:1-45:26
- Chen Ding, Jian Zhou, Kai Lu, Sicen Li, Yiqin Xiong, Jiguang Wan, Ling Zhan:
D2Comp: Efficient Offload of LSM-tree Compaction with Data Processing Units on Disaggregated Storage. 46:1-46:22
- Zhuohao Wang, Lei Liu, Limin Xiao:
iSwap: A New Memory Page Swap Mechanism for Reducing Ineffective I/O Operations in Cloud Environments. 47:1-47:24
- Junkaixuan Li, Yi Kang:
GraphSER: Distance-Aware Stream-Based Edge Repartition for Many-Core Systems. 48:1-48:25
- Ke Wu, Dezun Dong, Weixia Xu:
COER: A Network Interface Offloading Architecture for RDMA and Congestion Control Protocol Codesign. 49:1-49:26
- Qunyou Liu, Darong Huang, Luis Costero, Marina Zapater, David Atienza:
Intermediate Address Space: virtual memory optimization of heterogeneous architectures for cache-resident workloads. 50:1-50:23
- Dongmoon Min, Ilkwon Byun, Gyu-hyeon Lee, Jangwoo Kim:
CoolDC: A Cost-Effective Immersion-Cooled Datacenter with Workload-Aware Temperature Scaling. 51:1-51:27
- Hai Zhou, Dan Feng:
Stripe-schedule Aware Repair in Erasure-coded Clusters with Heterogeneous Star Networks. 52:1-52:24
- Bobin Deng, Bhargava Nadendla, Kun Suo, Chloe Yixin Xie, Dan Chia-Tien Lo:
Fixed-point Encoding and Architecture Exploration for Residue Number Systems. 53:1-53:27
- Yizhuo Wang, Fangli Chang, Bingxin Wei, Jianhua Gao, Weixing Ji:
Optimization of Sparse Matrix Computation for Algebraic Multigrid on GPUs. 54:1-54:27
- Luming Wang, Xu Zhang, Songyue Wang, Zhuolun Jiang, Tianyue Lu, Mingyu Chen, Siwei Luo, Keji Huang:
Asynchronous Memory Access Unit: Exploiting Massive Parallelism for Far Memory Access. 55:1-55:28
- Yunping Zhao, Sheng Ma, Hengzhu Liu, Dongsheng Li:
SAL: Optimizing the Dataflow of Spin-based Architectures for Lightweight Neural Networks. 56:1-56:27
- Kai Lu, Siqi Zhao, Haikang Shan, Qiang Wei, Guokuan Li, Jiguang Wan, Ting Yao, Huatao Wu, Daohui Wang:
Scythe: A Low-latency RDMA-enabled Distributed Transaction System for Disaggregated Memory. 57:1-57:26
- Wangqi Peng, Yusen Li, Xiaoguang Liu, Gang Wang:
Lavender: An Efficient Resource Partitioning Framework for Large-Scale Job Colocation. 58:1-58:23
- Feng Zhang, Fulin Nan, Binbin Xu, Zhirong Shen, Jiebin Zhai, Dmitrii Kalplun, Jiwu Shu:
Achieving Tunable Erasure Coding with Cluster-Aware Redundancy Transitioning. 59:1-59:24
- Ataberk Olgun, F. Nisa Bostanci, Geraldo Francisco de Oliveira Junior, Yahya Can Tugrul, Rahul Bera, Abdullah Giray Yaglikçi, Hasan Hassan, Oguz Ergin, Onur Mutlu:
Sectored DRAM: A Practical Energy-Efficient and High-Performance Fine-Grained DRAM Architecture. 60:1-60:29
- Xiaohui Wei, Chenyang Wang, Hengshan Yue, Jingweijia Tan, Zeyu Guan, Nan Jiang, Xinyang Zheng, Jianpeng Zhao, Meikang Qiu:
ReIPE: Recycling Idle PEs in CNN Accelerator for Vulnerable Filters Soft-Error Detection. 61:1-61:26
- Qiao Li, Yu Chen, Guanyu Wu, Yajuan Du, Min Ye, Xinbiao Gan, Jie Zhang, Zhirong Shen, Jiwu Shu, Chun Xue:
Characterizing and Optimizing LDPC Performance on 3D NAND Flash Memories. 62:1-62:26
- Jiahong Xu, Haikun Liu, Zhuohui Duan, Xiaofei Liao, Hai Jin, Xiaokang Yang, Huize Li, Cong Liu, Fubing Mao, Yu Zhang:
ReHarvest: An ADC Resource-Harvesting Crossbar Architecture for ReRAM-Based DNN Accelerators. 63:1-63:26
- Jiang Wu, Zhuo Zhang, Deheng Yang, Jianjun Xu, Jiayu He, Xiaoguang Mao:
Time-Aware Spectrum-Based Bug Localization for Hardware Design Code with Data Purification. 64:1-64:25