32nd ICS 2018: Beijing, China
- Proceedings of the 32nd International Conference on Supercomputing, ICS 2018, Beijing, China, June 12-15, 2018. ACM 2018, ISBN 978-1-4503-5783-8
File system, I/O and Storage System
- Jinrui Cao, Om Rameshwar Gatla, Mai Zheng, Dong Dai, Vidya Eswarappa, Yan Mu, Yong Chen:
PFault: A General Framework for Analyzing the Reliability of High-Performance Parallel File Systems. 1-11
- Jie Yu, Guangming Liu, Xin Liu, Wenrui Dong, Xiaoyong Li, Yusheng Liu:
Rethinking Node Allocation Strategy for Data-intensive Applications in Consideration of Spatially Bursty I/O. 12-21
- Wenhui Zhang, Qiang Cao, Hong Jiang, Jie Yao:
PA-SSD: A Page-Type Aware TLC SSD for Improved Write/Read Performance and Storage Efficiency. 22-32
- Anthony Kougkas, Hariharan Devarajan, Xian-He Sun:
IRIS: I/O Redirection via Integrated Storage. 33-42
GPUs-I: Execution Model
- Husheng Zhou, Soroush Bateni, Cong Liu:
GRU: Exploring Computation and Data Redundancy via Partial GPU Computing Result Reuse. 43-52
- Ang Li, Weifeng Liu, Linnan Wang, Kevin J. Barker, Shuaiwen Leon Song:
Warp-Consolidation: A Novel Execution Model for GPUs. 53-64
- Xia Zhao, Zhiying Wang, Lieven Eeckhout:
Classification-Driven Search for Effective SM Partitioning in Multitasking GPUs. 65-75
GPUs-II: GPU and Algorithm
- Bernhard Kerbl, Michael Kenzel, Joerg H. Mueller, Dieter Schmalstieg, Markus Steinberger:
The Broker Queue: A Fast, Linearizable FIFO Queue for Fine-Granular Work Distribution on the GPU. 76-85
- Ben Karsin, Volker Weichert, Henri Casanova, John Iacono, Nodari Sitchinava:
Analysis-driven Engineering of Comparison-based Sorting Algorithms on GPUs. 86-95
- Jinsung Kim, Aravind Sukumaran-Rajam, Changwan Hong, Ajay Panyala, Rohit Kumar Srivastava, Sriram Krishnamoorthy, P. Sadayappan:
Optimizing Tensor Contractions in CCSD(T) for Efficient Execution on GPUs. 96-106
Architecture
- Zhaoxiang Jin, Soner Önder:
A two-phase recovery mechanism. 107-117
- Reena Panda, Lizy K. John:
HALO: A Hierarchical Memory Access Locality Modeling Technique For Memory System Explorations. 118-128
- Jose Antonio Pascual, Javier Navaridas:
High-Performance, Low-Complexity Deadlock Avoidance for Arbitrary Topologies/Routings. 129-138
Accelerator
- Dongwoo Lee, Sungbum Kang, Kiyoung Choi:
ComPEND: Computation Pruning through Early Negative Detection for ReLU in a Deep Neural Network Accelerator. 139-148
- Hao Yan, Hebin R. Cherian, Ethan C. Ahn, Lide Duan:
CELIA: A Device and Architecture Co-Design Framework for STT-MRAM-Based Deep Learning Acceleration. 149-159
- Jacob Lambert, Seyong Lee, Jungwon Kim, Jeffrey S. Vetter, Allen D. Malony:
Directive-Based, High-Level Programming and Optimizations for High-Performance Computing with FPGAs. 160-171
Application and Programming Framework
- Xue Li, Mingxing Zhang, Kang Chen, Yongwei Wu:
ReGraph: A Graph Processing Framework that Alternately Shrinks and Repartitions the Graph. 172-183
- Xiuhong Li, Yun Liang, Wentai Zhang, Taide Liu, Haochen Li, Guojie Luo, Ming Jiang:
cuMBIR: An Efficient Framework for Low-dose X-ray CT Image Reconstruction on GPUs. 184-194
- Feng Zhang, Jidong Zhai, Xipeng Shen, Onur Mutlu, Wenguang Chen:
Zwift: A Programming Framework for High Performance Text Analytics on Compressed Data. 195-206
Runtime System and Library
- Isaac Sánchez Barrera, Miquel Moretó, Eduard Ayguadé, Jesús Labarta, Mateo Valero, Marc Casas:
Reducing Data Movement on Large Shared Memory Systems by Exploiting Computation Dependencies. 207-217
- Lluc Alvarez, Marc Casas, Jesús Labarta, Eduard Ayguadé, Mateo Valero, Miquel Moretó:
Runtime-Guided Management of Stacked DRAM Memories in Task Parallel Programs. 218-228
- François Tessier, Paul Gressier, Venkatram Vishwanath:
Optimizing Data Aggregation by Leveraging the Deep Memory Hierarchy on Large-scale Systems. 229-239
Program Analysis
- Lai Wei, John M. Mellor-Crummey:
Automated Analysis of Time Series Data to Understand Parallel Program Behaviors. 240-251
- Hui Zhang, Jeffrey K. Hollingsworth:
ChplBlamer: A Data-centric and Code-centric Combined Profiler for Multi-locale Chapel Programs. 252-262
- Shasha Wen, Lucy Cherkasova, Felix Xiaozhu Lin, Xu Liu:
ProfDP: A Lightweight Profiler to Guide Data Placement in Heterogeneous Memory Systems. 263-273
System Design
- Nadja Peters, Sangyoung Park, Daniel Clifford, S. Kyostila, Ross McIlroy, Benedikt Meurer, Hannes Payer, Samarjit Chakraborty:
Phase-Aware Web Browser Power Management on HMP Platforms. 274-283
- Ke Zhou, Si Sun, Hua Wang, Ping Huang, Xubin He, Rui Lan, Wenyan Li, Wenjie Liu, Tianming Yang:
Demystifying Cache Policies for Photo Stores at Scale: A Tencent Case Study. 284-294
- Zhihao Jia, Sean Treichler, Galen M. Shipman, Patrick S. McCormick, Alex Aiken:
Isometry: A Path-Based Distributed Data Transfer System. 295-306
Parallel Algorithm
- Yang You, James Demmel, Cho-Jui Hsieh, Richard W. Vuduc:
Accurate, Fast and Scalable Kernel Ridge Regression on Parallel and Distributed Systems. 307-317
- Keke Zhai, Tania Banerjee, David Zwick, Jason Hackl, Sanjay Ranka:
Dynamic Load Balancing for Compressible Multiphase Turbulence. 318-327
Compiler and OS
- Jiacheng Zhao, Huimin Cui, Yalin Zhang, Jingling Xue, Xiaobing Feng:
Revisiting Loop Tiling for Datacenters: Live and Let Live. 328-340
- Shikai Li, Sunghyun Park, Scott A. Mahlke:
Sculptor: Flexible Approximation with Selective Dynamic Loop Perforation. 341-351
- Jee Ho Ryoo, Lizy K. John, Arkaprava Basu:
A Case for Granularity Aware Page Migration. 352-362
Optimization and Performance Tuning
- Changxi Liu, Biwei Xie, Xin Liu, Wei Xue, Hailong Yang, Xu Liu:
Towards Efficient SpMV on Sunway Manycore Architectures. 363-373
- Venkatesan T. Chakaravarthy, Jee W. Choi, Douglas J. Joseph, Prakash Murali, Shivmaran S. Pandian, Yogish Sabharwal, Dheeraj Sreedhar:
On Optimizing Distributed Tucker Decomposition for Sparse Tensors. 374-384
- Jayaraman J. Thiagarajan, Nikhil Jain, Rushil Anirudh, Alfredo Giménez, Rahul Sridhar, Aniruddha Marathe, Tao Wang, Murali Emani, Abhinav Bhatele, Todd Gamblin:
Bootstrapping Parameter Space Exploration for Fast Tuning. 385-395