30th HPCA 2024: Edinburgh, UK
- IEEE International Symposium on High-Performance Computer Architecture, HPCA 2024, Edinburgh, United Kingdom, March 2-6, 2024. IEEE 2024, ISBN 979-8-3503-9313-2
- Sabuj Laskar, Pranati Majhi, Sungkeun Kim, Farabi Mahmud, Abdullah Muzahid, Eun Jung Kim: Enhancing Collective Communication in MCM Accelerators for Deep Learning Training. 1-16
- Antonis Psistakis, Fabien Chaix, Josep Torrellas: MINOS: Distributed Consistency and Persistency Protocol Implementation & Offloading to SmartNICs. 1-17
- Neal Clayton Crago, Sana Damani, Karthikeyan Sankaralingam, Stephen W. Keckler: WASP: Exploiting GPU Pipeline Parallelism with Hardware-Accelerated Automatic Warp Specialization. 1-16
- Ke Xu, Ming Tang, Quancheng Wang, Han Wang: Exploitation of Security Vulnerability on Retirement. 1-14
- Rahaf Abdullah, Hyokeun Lee, Huiyang Zhou, Amro Awad: Salus: Efficient Security Support for CXL-Expanded GPU Memory. 1-15
- Alexander C. Rucker, Shiv Sundram, Coleman Smith, Matthew Vilim, Raghu Prabhakar, Fredrik Kjølstad, Kunle Olukotun: Revet: A Language and Compiler for Dataflow Threads. 1-14
- Yun Chen, Ali Hajiabadi, Trevor E. Carlson: GADGETSPINNER: A New Transient Execution Primitive Using the Loop Stream Detector. 15-30
- Chang Liu, Dongsheng Wang, Yongqiang Lyu, Pengfei Qiu, Yu Jin, Zhuoyuan Lu, Yinqian Zhang, Gang Qu: Uncovering and Exploiting AMD Speculative Memory Access Predictors for Fun and Profit. 31-45
- Dajiang Liu, Yuxin Xia, Jiaxing Shang, Jiang Zhong, Peng Ouyang, Shouyi Yin: E2EMap: End-to-End Reinforcement Learning for CGRA Compilation via Reverse Mapping. 46-60
- Weichuang Zhang, Jieru Zhao, Guan Shen, Quan Chen, Chen Chen, Minyi Guo: An Optimizing Framework on MLIR for Efficient FPGA-based Accelerator Generation. 75-90
- Yi Li, Tsun-Yu Yang, Ming-Chang Yang, Zhaoyan Shen, Bingzhe Li: Celeritas: Out-of-Core Based Unsupervised Graph Neural Network via Cross-Layer Computing 2024. 91-107
- Sumit Walia, Cheng Ye, Arkid Bera, Dhruvi Lodhavia, Yatish Turakhia: TALCO: Tiling Genome Sequence Alignment Using Convergence of Traceback Pointers. 91-107
- Deniz Gurevin, Mohsin Shan, Shaoyi Huang, Md Amit Hasan, Caiwen Ding, Omer Khan: PruneGNN: Algorithm-Architecture Pruning Framework for Graph Neural Network Acceleration. 108-123
- Zeyu Zhu, Fanrong Li, Gang Li, Zejian Liu, Zitao Mo, Qinghao Hu, Xiaoyao Liang, Jian Cheng: MEGA: A Memory-Efficient GNN Accelerator Exploiting Degree-Aware Mixed-Precision Quantization. 124-138
- Jeongmin Hong, Sungjun Cho, Geonwoo Park, Wonhyuk Yang, Young-Ho Gong, Gwangsun Kim: Bandwidth-Effective DRAM Cache for GPUs with Storage-Class Memory. 139-155
- Jingwei Cai, Zuotong Wu, Sen Peng, Yuchen Wei, Zhanhong Tan, Guiming Shi, Mingyu Gao, Kaisheng Ma: Gemini: Mapping and Architecture Co-exploration for Large-scale DNN Chiplet Accelerators. 156-171
- Ruixin Mao, Lin Tang, Xingyu Yuan, Ye Liu, Jun Zhou: Stellar: Energy-Efficient and Low-Latency SNN Algorithm and Hardware Co-Design with Spatiotemporal Computation. 172-185
- Geraldo F. Oliveira, Ataberk Olgun, Abdullah Giray Yaglikçi, F. Nisa Bostanci, Juan Gómez-Luna, Saugata Ghose, Onur Mutlu: MIMDRAM: An End-to-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data Computing. 186-203
- Seonjin Na, Jungwoo Kim, Sunho Lee, Jaehyuk Huh: Supporting Secure Multi-GPU Computing with Dynamic and Batched Metadata Management. 204-217
- Yuanchao Xu, James Pangia, Chencheng Ye, Yan Solihin, Xipeng Shen: Data Enclave: A Data-Centric Trusted Execution Environment. 218-232
- Prasetiyo, Adiwena Putra, Joo-Young Kim: Morphling: A Throughput-Maximized TFHE-based Accelerator using Transform-domain Reuse. 249-262
- Bongjoon Hyun, Taehun Kim, Dongjae Lee, Minsoo Rhu: Pathfinding Future PIM Architectures by Demystifying a Commercial PIM Technology. 263-279
- Ismail Emir Yüksel, Yahya Can Tugrul, Ataberk Olgun, F. Nisa Bostanci, Abdullah Giray Yaglikçi, Geraldo F. Oliveira, Haocong Luo, Juan Gómez-Luna, Mohammad Sadrosadati, Onur Mutlu: Functionally-Complete Boolean Logic in Real DRAM Chips: Experimental Characterization and Analysis. 280-296
- Yuda An, Yunxiao Tang, Shushu Yi, Li Peng, Xiurui Pan, Guangyu Sun, Zhaochu Luo, Qiao Li, Jie Zhang: StreamPIM: Streaming Matrix Computation in Racetrack Memory. 297-311
- Neel Patel, Amin Mamandipoor, Mohammad Nouri, Mohammad Alian: SmartDIMM: In-Memory Acceleration of Upper Layer Protocols. 312-329
- Yuyue Wang, Xiurui Pan, Yuda An, Jie Zhang, Glenn Reinman: BeaconGNN: Large-Scale GNN Acceleration with Out-of-Order Streaming In-Storage Computing. 330-344
- Hongsun Jang, Jaeyong Song, Jaewon Jung, Jaeyoung Park, Youngsok Kim, Jinho Lee: Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System. 345-360
- Fuping Niu, Jianhui Yue, Jiangqiu Shen, Xiaofei Liao, Hai Jin: FlashGNN: An In-SSD Accelerator for GNN Training. 361-378
- Donghyun Gouk, Miryeong Kwon, Hanyeoreum Bae, Myoungsoo Jung: DockerSSD: Containerized In-Storage Processing and Hardware Acceleration for Computational SSDs. 379-394
- Yun Chen, Ali Hajiabadi, Lingfeng Pei, Trevor E. Carlson: PREFETCHX: Cross-Core Cache-Agnostic Prefetcher-based Side-Channel Attacks. 395-408
- Quancheng Wang, Ming Tang, Ke Xu, Han Wang: Modeling, Derivation, and Automated Analysis of Branch Predictor Security Vulnerabilities. 409-423
- Xin Zhang, Zhi Zhang, Qingni Shen, Wenhao Wang, Yansong Gao, Zhuoxi Yang, Jiliang Zhang: SegScope: Probing Fine-grained Interrupts via Architectural Footprints. 424-438
- Gelin Fu, Tian Xia, Zhongpei Luo, Ruiyang Chen, Wenzhe Zhao, Pengju Ren: Differential-Matching Prefetcher for Indirect Memory Access. 439-453
- Minjae Lee, Seongmin Park, Hyungmin Kim, Minyong Yoon, Janghwan Lee, Jun Won Choi, Nam Sung Kim, Mingu Kang, Jungwook Choi: SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving. 454-467
- Chenlin Ma, Yingping Wang, Fuwen Chen, Jing Liao, Yi Wang, Rui Mao: Rapper: A Parameter-Aware Repair-in-Memory Accelerator for Blockchain Storage Platform. 468-482
- Lingyi Huang, Yu Gong, Yang Sui, Xiao Zang, Bo Yuan: MOPED: Efficient Motion Planning Engine with Flexible Dimension Support. 483-497
- Sebastian S. Kim, Alberto Ros: Effective Context-Sensitive Memory Dependence Prediction. 515-527
- Alexandre Valentin Jamet, Georgios Vavouliotis, Daniel A. Jiménez, Lluc Alvarez, Marc Casas: A Two Level Neural Approach Combining Off-Chip Prediction with Adaptive Prefetch Filtering. 528-542
- Odysseas Chatzopoulos, George Papadimitriou, Vasileios Karakostas, Dimitris Gizopoulos: Gem5-MARVEL: Microarchitecture-Level Resilience Analysis of Heterogeneous SoC Architectures. 543-559
- Abdullah Giray Yaglikçi, Yahya Can Tugrul, Geraldo F. Oliveira, Ismail Emir Yüksel, Ataberk Olgun, Haocong Luo, Onur Mutlu: Spatial Variation-Aware Read Disturbance Defenses: Experimental Analysis of Real DRAM Chips and Implications on Future Solutions. 560-577
- Anish Saxena, Moinuddin K. Qureshi: START: Scalable Tracking for any Rowhammer Threshold. 578-592
- F. Nisa Bostanci, Ismail Emir Yüksel, Ataberk Olgun, Konstantinos Kanellopoulos, Yahya Can Tugrul, A. Giray Yaglikçi, Mohammad Sadrosadati, Onur Mutlu: CoMeT: Count-Min-Sketch-based Row Tracking to Mitigate RowHammer at Low Cost. 593-612
- Theodoros Trochatos, Chuanqi Xu, Sanjay Deshpande, Yao Lu, Yongshan Ding, Jakub Szefer: A Quantum Computer Trusted Execution Environment. 613
- Jaewan Choi, Jaehyun Park, Kwanhee Kyung, Nam Sung Kim, Jung Ho Ahn: Unleashing the Potential of PIM: Accelerating Large Batched Inference of Transformer-Based Generative Models. 614
- Joonseop Sim, Soohong Ahn, Taeyoung Ahn, Seungyong Lee, Myunghyun Rhee, Jooyoung Kim, Kwangsik Shin, Donguk Moon, Euiseok Kim, Kyoung Park: Computational CXL-Memory Solution for Accelerating Memory-Intensive Applications. 615
- Shengzhe Wang, Zihang Lin, Suzhen Wu, Hong Jiang, Jie Zhang, Bo Mao: LearnedFTL: A Learning-Based Page-Level FTL for Reducing Double Reads in Flash-Based SSDs. 616-629
- Shih-Hung Tseng, Tseng-Yi Chen, Ming-Chang Yang: Are Superpages Super-fast? Distilling Flash Blocks to Unify Flash Pages of a Superpage in an SSD. 630-642
- Myoungjun Chun, Jaeyong Lee, Myungsuk Kim, Jisung Park, Jihong Kim: RiF: Improving Read Performance of Modern SSDs Using an On-Die Early-Retry Engine. 643-656
- Qiao Li, Hongyang Dang, Zheng Wan, Congming Gao, Min Ye, Jie Zhang, Tei-Wei Kuo, Chun Jason Xue: Midas Touch: Invalid-Data Assisted Reliability and Performance Boost for 3D High-Density Flash. 657-670
- Chetan Choppali Sudarshan, Nikhil Matkar, Sarma B. K. Vrudhula, Sachin S. Sapatnekar, Vidya A. Chhabria: ECO-CHIP: Estimation of Carbon Footprint of Chiplet-based Architectures for Sustainable VLSI. 671-685
- Hanqing Zhu, Jiaqi Gu, Hanrui Wang, Zixuan Jiang, Zhekai Zhang, Rongxing Tang, Chenghao Feng, Song Han, Ray T. Chen, David Z. Pan: Lightening-Transformer: A Dynamically-Operated Optically-Interconnected Photonic Transformer Accelerator. 686-703
- Evan McKinney, Michael Hatridge, Alex K. Jones: MIRAGE: Quantum Circuit Decomposition and Routing Collaborative Design Using Mirror Gates. 704-718
- Siddhartha Raman Sundara Raman, Lizy K. John, Jaydeep P. Kulkarni: SACHI: A Stationarity-Aware, All-Digital, Near-Memory, Ising Architecture. 719-731
- Man Shi, Vikram Jain, Antony Joseph, Maurice Meijer, Marian Verhelst: BitWave: Exploiting Column-Based Bit-Level Sparsity for Deep Learning Acceleration. 732-746
- Dongseok Im, Hoi-Jun Yoo: LUTein: Dense-Sparse Bit-Slice Architecture With Radix-4 LUT-Based Slice-Tensor Processing Units. 747-759
- Jaeyong Jang, Yulhwa Kim, Juheun Lee, Jae-Joon Kim: FIGNA: Integer Unit-Based Accelerator Design for FP-INT GEMM Preserving Numerical Accuracy. 760-773
- Huize Li, Zhaoying Li, Zhenyu Bai, Tulika Mitra: ASADI: Accelerating Sparse Attention Using Diagonal-based In-Situ Computing. 774-787
- Jie Ren, Dong Xu, Shuangyan Yang, Jiacheng Zhao, Zhicheng Li, Christian Navasca, Chenxi Wang, Guoqing Harry Xu, Dong Li: Enabling Large Dynamic Neural Network Training with Learning-based Memory Management. 788-802
- Zhiqi Lin, Youshan Miao, Guanbin Xu, Cheng Li, Olli Saarikivi, Saeed Maleki, Fan Yang: Tessel: Boosting Distributed Execution of Large DNN Models via Flexible Schedule Search. 803-816
- Yuhui Zhang, Lutan Zhao, Cheng Che, XiaoFeng Wang, Dan Meng, Rui Hou: SpecFL: An Efficient Speculative Federated Learning System for Tree-based Model Training. 817-831
- Yu-Yuan Liu, Hong-Sheng Zheng, Yu Fang Hu, Chen-Fong Hsu, Tsung Tai Yeh: TinyTS: Memory-Efficient TinyML Model Compiler Framework on Microcontrollers. 848-860
- Sai Qian Zhang, Thierry Tambe, Nestor Cuevas, Gu-Yeon Wei, David Brooks: CAMEL: Co-Designing AI Models and eDRAMs for Efficient On-Device Learning. 861-875
- Alexander Buck, Karthik Ganesan, Natalie Enright Jerger: FlipBit: Approximate Flash Memory for IoT Devices. 876-890
- Cyan Subhra Mishra, Jack Sampson, Mahmut Taylan Kandemir, Vijaykrishnan Narayanan, Chita R. Das: Usas: A Sustainable Continuous-Learning Framework for Edge Servers. 891-907
- Wenxue Li, Junyi Zhang, Yufei Liu, Gaoxiong Zeng, Zilong Wang, Chaoliang Zeng, Pengpeng Zhou, Qiaoling Wang, Kai Chen: Cepheus: Accelerating Datacenter Applications with High-Performance RoCE-Capable Multicast. 908-921
- Yueying Li, Nikita Lazarev, David Koufaty, Tenny Yin, Andy Anderson, Zhiru Zhang, G. Edward Suh, Kostis Kaffes, Christina Delimitrou: LibPreemptible: Enabling Fast, Adaptive, and Hardware-Assisted User-Space Scheduling. 922-936
- Yanqi Zhang, Zhuangzhuang Zhou, Sameh Elnikety, Christina Delimitrou: Ursa: Lightweight Resource Management for Cloud-Native Microservices. 954-969
- Sangsoo Park, KyungSoo Kim, Jinin So, Jin Jung, Jonggeon Lee, Kyoungwan Woo, Nayeon Kim, Younghyun Lee, Hyungyo Kim, Yongsuk Kwon, Jinhyun Kim, Jieun Lee, YeonGon Cho, Yongmin Tai, Jeonghyeon Cho, Hoyoung Song, Jung Ho Ahn, Nam Sung Kim: An LPDDR-based CXL-PNM Platform for TCO-efficient Inference of Transformer-based Large Language Models. 970-982
- Jiexiong Xu, Yiquan Chen, Yijing Wang, Wenhui Shi, Guoju Fang, Yi Chen, Huasheng Liao, Yang Wang, Hai Lin, Zhen Jin, Qiang Liu, Wenzhi Chen: LightPool: A NVMe-oF-based High-performance and Lightweight Storage Pool Architecture for Cloud-Native Distributed Database. 983-995
- Alper Buyuktosunoglu, David Trilla, Bülent Abali, Deanna Postles Dunn Berger, Craig R. Walters, Jang-Soo Lee: Enterprise-Class Cache Compression Design. 996-1011
- Gerasimos Gerogiannis, Sriram Aananthakrishnan, Josep Torrellas, Ibrahim Hur: HotTiles: Accelerating SpMM with Heterogeneous Accelerator Architectures. 1012-1028
- Fangxin Liu, Ning Yang, Haomin Li, Zongwu Wang, Zhuoran Song, Songwen Pei, Li Jiang: SPARK: Scalable and Precision-Aware Acceleration of Neural Networks via Efficient Encoding. 1029-1042
- Shu-Ting Wang, Hanyang Xu, Amin Mamandipoor, Rohan Mahapatra, Byung Hoon Ahn, Soroush Ghodrati, Krishnan Kailas, Mohammad Alian, Hadi Esmaeilzadeh: Data Motion Acceleration: Chaining Cross-Domain Multi Accelerators. 1043-1062
- Sudhanshu Gupta, Sandhya Dwarkadas: RELIEF: Relieving Memory Pressure In SoCs Via Data Movement-Aware Accelerator Scheduling. 1063-1079
- Yueqi Wang, Bingyao Li, Aamer Jaleel, Jun Yang, Xulong Tang: GRIT: Enhancing Multi-GPU Performance with Fine-Grained Dynamic Page Placement. 1080-1094
- Yalong Shan, Yongkui Yang, Xuehai Qian, Zhibin Yu: Guser: A GPGPU Power Stressmark Generator. 1111-1124
- Hossein SeyyedAghaei, Mahmood Naderan-Tahan, Lieven Eeckhout: GPU Scale-Model Simulation. 1125-1140
- Jaeyoon Lee, Wonyeong Jung, Dongwhee Kim, Daero Kim, Junseung Lee, Jungrae Kim: Agile-DRAM: Agile Trade-Offs in Memory Capacity, Latency, and Energy for Data Centers. 1141-1153
- Xiaoyang Lu, Hamed Najafi, Jason Liu, Xian-He Sun: CHROME: Concurrency-Aware Holistic Cache Management Framework with Online Reinforcement Learning. 1154-1167
- K. P. Arun, Debadatta Mishra, Biswabandan Panda: Prosper: Program Stack Persistence in Hybrid Memory Systems. 1168-1183
- Ronglong Wu, Zhirong Shen, Zhiwei Yang, Jiwu Shu: Mitigating Write Disturbance in Non-Volatile Memory via Coupling Machine Learning with Out-of-Place Updates. 1184-1198