47th ICPP 2018: Eugene, OR, USA
- Proceedings of the 47th International Conference on Parallel Processing, ICPP 2018, Eugene, OR, USA, August 13-16, 2018. ACM 2018
Best Paper
- Yang You, Zhao Zhang, Cho-Jui Hsieh, James Demmel, Kurt Keutzer:
ImageNet Training in Minutes. 1:1-1:10
Graph Applications
- Kun Qiu, Yuanyang Zhu, Jing Yuan, Jin Zhao, Xin Wang, Tilman Wolf:
ParaPLL: Fast Parallel Shortest-path Distance Query on Large-scale Weighted Graphs. 2:1-2:10
- Xianghao Xu, Fang Wang, Hong Jiang, Yongli Cheng, Dan Feng, Yongxuan Zhang:
HUS-Graph: I/O-Efficient Out-of-Core Graph Processing with Hybrid Update Strategy. 3:1-3:10
- Jianping Zeng, Hongfeng Yu:
A Distributed Infomap Algorithm for Scalable and High-Quality Community Detection. 4:1-4:11
Monitoring and Network Analysis
- Ramin Izadpanah, Nichamon Naksinehaboon, Jim M. Brandt, Ann C. Gentile, Damian Dechev:
Integrating Low-latency Analysis into HPC System Monitoring. 5:1-5:10
- Arya Mazaheri, Felix Wolf, Ali Jannesari:
Unveiling Thread Communication Bottlenecks Using Hardware-Independent Metrics. 6:1-6:10
- Kevin A. Brown, Nikhil Jain, Satoshi Matsuoka, Martin Schulz, Abhinav Bhatele:
Interference between I/O and MPI Traffic on Fat-tree Networks. 7:1-7:10
Task Placement Algorithms
- Amelie Chi Zhou, Tien-Dat Phan, Shadi Ibrahim, Bingsheng He:
Energy-Efficient Speculative Execution using Advanced Reservation for Heterogeneous Clusters. 8:1-8:10
- Roland Glantz, Maria Predari, Henning Meyerhenke:
Topology-induced Enhancement of Mappings. 9:1-9:10
- Haipeng Dai, Ke Sun, Alex X. Liu, Lijun Zhang, Jiaqi Zheng, Guihai Chen:
Charging Task Scheduling for Directional Wireless Charger Networks. 10:1-10:10
Astronomy and Earth Systems
- Thomas R. Devine, Katerina Goseva-Popstojanova, Di Pang:
Scalable Solutions for Automated Single Pulse Identification and Classification in Radio Astronomy. 11:1-11:11
- Junmin Xiao, Shigang Li, Baodong Wu, He Zhang, Kun Li, Erlin Yao, Yunquan Zhang, Guangming Tan:
Communication-Avoiding for Dynamical Core of Atmospheric General Circulation Model. 12:1-12:10
- Satish Puri, Anmol Paudel, Sushil K. Prasad:
MPI-Vector-IO: Parallel I/O and Partitioning for Geospatial Vector Data. 13:1-13:11
Networking Algorithms
- Yang Chen, Jie Wu:
NFV Middlebox Placement with Balanced Set-up Cost and Bandwidth Consumption. 14:1-14:10
- Xu Lin, Deke Guo, Yulong Shen, Guoming Tang, Bangbang Ren:
DAG-SFC: Minimize the Embedding Cost of SFC with Parallel VNFs. 15:1-15:10
- Xiaoyu Wang, Haipeng Dai, Weijun Wang, Jiaqi Zheng, Guihai Chen, Wanchun Dou, Xiaobing Wu:
Heterogeneous Wireless Charger Placement with Obstacles. 16:1-16:10
Performance Tools and Methodologies
- Ajay Ramaswamy, Nalini Kumar, Aravind Neelakantan, Herman Lam, Greg Stitt:
Scalable Behavioral Emulation of Extreme-Scale Systems Using Structural Simulation Toolkit. 17:1-17:11
- Brian Kocoloski, John R. Lange:
Varbench: an Experimental Framework to Measure and Characterize Performance Variability. 18:1-18:10
- François Trahay, Manuel Selva, Lionel Morel, Kevin Marquet:
NumaMMA: NUMA MeMory Analyzer. 19:1-19:10
Algorithms
- Rintu Panja, Sathish Vadhiyar:
MND-MST: A Multi-Node Multi-Device Parallel Boruvka's MST Algorithm. 20:1-20:10
- Zachary Blanco, Bangtian Liu, Maryam Mehri Dehnavi:
CSTF: Large-Scale Sparse Tensor Factorizations on Distributed Platforms. 21:1-21:10
- Saeed Soori, Aditya Devarakonda, Zachary Blanco, James Demmel, Mert Gürbüzbalaban, Maryam Mehri Dehnavi:
Reducing Communication in Proximal Newton Methods for Sparse Least Squares Problems. 22:1-22:10
- Juan Zhao, Junqiang Song, Min Zhu, Jincai Li, Zhenyu Huang, Xiaoyong Li, Xiaoli Ren:
PBCS: An Efficient Parallel Characteristic Set Method for Solving Boolean Polynomial Systems. 23:1-23:10
Performance on GPU Systems
- Lionel Eyraud-Dubois, Thomas Lambert:
Using Static Allocation Algorithms for Matrix Matrix Multiplication on Multicores and GPUs. 24:1-24:10
- Zhuohang Lai, Qiong Luo, Xiaoying Jia:
Revisiting Multi-pass Scatter and Gather on GPUs. 25:1-25:11
- Wei Tan, Shiyu Chang, Liana Fong, Cheng Li, Zijun Wang, Liangliang Cao:
Matrix Factorization on GPUs with Memory Optimization and Approximate Computing. 26:1-26:10
- André Weißenberger, Bertil Schmidt:
Massively Parallel Huffman Decoding on GPUs. 27:1-27:10
Scheduling Algorithms
- Li Han, Valentin Le Fèvre, Louis-Claude Canon, Yves Robert, Frédéric Vivien:
A Generic Approach to Scheduling and Checkpointing Workflows. 28:1-28:10
- Yibo Jin, Zhuzhong Qian, Song Guo, Sheng Zhang, Xiaoliang Wang, Sanglu Lu:
ran-GJS: Orchestrating Data Analytics for Heterogeneous Geo-distributed Edges. 29:1-29:10
- Binlei Cai, Rongqi Zhang, Laiping Zhao, Keqiu Li:
Less Provisioning: A Fine-grained Resource Scaling Engine for Long-running Services with Tail Latency Guarantees. 30:1-30:11
- Brandon Nesterenko, Qing Yi, Jia Rao:
Improving Resource Utilization through Demand Aware Process Scheduling. 31:1-31:10
Machine Learning and Networks
- Haitao Zhang, Bingchang Tang, Xin Geng, Huadong Ma:
Learning Driven Parallelization for Large-Scale Video Workload in Hybrid CPU-GPU Cluster. 32:1-32:10
- Hao Fu, Shanjiang Tang, Bingsheng He, Ce Yu, Jizhou Sun:
GLP4NN: A Convergence-invariant and Network-agnostic Light-weight Parallelization Framework for Deep Neural Networks on Modern GPUs. 33:1-33:10
- Xinyu Chen, Jeremy Benson, Matt Peterson, Michela Taufer, Trilce Estrada:
KeyBin2: Distributed Clustering for Scalable and In-Situ Analysis. 34:1-34:10
- Jiang Xiao, Zhuang Xiong, Song Wu, Yusheng Yi, Hai Jin, Kan Hu:
Disk Failure Prediction in Data Centers via Online Learning. 35:1-35:10
Memory Performance
- Anne Benoit, Swann Perarnau, Loïc Pottier, Yves Robert:
A Performance Model to Execute Workflows on High-Bandwidth-Memory Architectures. 36:1-36:10
- Neil Butcher, Stephen L. Olivier, Jonathan W. Berry, Simon D. Hammond, Peter M. Kogge:
Optimizing for KNL Usage Modes When Data Doesn't Fit in MCDRAM. 37:1-37:10
- Mohamed Mohamedin, Sebastiano Peluso, Masoomeh Javidi Kishi, Ahmed Hassan, Roberto Palmieri:
Nemo: NUMA-aware Concurrency Control for Scalable Transactional Memory. 38:1-38:10
- Kai Xu, Robin Kobus, Yuandong Chan, Ping Gao, Xiangxu Meng, Yanjie Wei, Bertil Schmidt, Weiguo Liu:
SPECTR: Scalable Parallel Short Read Error Correction on Multi-core and Many-core Architectures. 39:1-39:10
Networking
- Qian Liu, Haipeng Dai, Alex X. Liu, Qi Li, Xiaoyu Wang, Jiaqi Zheng:
Cache Assisted Randomized Sharing Counters in Network Measurement. 40:1-40:10
- Md. Shafayat Rahman, Md Atiqul Mollah, Peyman Faizian, Xin Yuan:
Load-Balanced Slim Fly Networks. 41:1-41:10
- Jiayao Wang, Abdullah Al-Mamun, Tonglin Li, Linhua Jiang, Dongfang Zhao:
Toward Performant and Energy-efficient Queries in Three-tier Wireless Sensor Networks. 42:1-42:10
- Anping He, Guangbo Feng, Jilin Zhang, Pengfei Li, Yong Hei, Hong Chen:
Click-Based Asynchronous Mesh Network with Bounded Bundled Data. 43:1-43:8
Machine Learning
- Longlong Liao, Kenli Li, Keqin Li, Canqun Yang, Qi Tian:
UHCL-Darknet: An OpenCL-based Deep Neural Network Framework for Heterogeneous Multi-/Many-core Clusters. 44:1-44:10
- Connor Imes, Steven A. Hofmeyr, Henry Hoffmann:
Energy-efficient Application Resource Scheduling using Machine Learning Classifiers. 45:1-45:11
- Michael R. Wyatt II, Stephen Herbein, Todd Gamblin, Adam Moody, Dong H. Ahn, Michela Taufer:
PRIONN: Predicting Runtime and IO using Neural Networks. 46:1-46:12
Materials and Molecular Dynamics
- Shigang Li, Baodong Wu, Yunquan Zhang, Xianmeng Wang, Jianjiang Li, Changjun Hu, Jue Wang, Yangde Feng, Ningming Nie:
Massively Scaling the Metal Microscopic Damage Simulation on Sunway TaihuLight Supercomputer. 47:1-47:11
- Raphaël Prat, Laurent Colombet, Raymond Namyst:
Combining Task-based Parallelism and Adaptive Mesh Refinement Techniques in Molecular Dynamics Simulations. 48:1-48:10
- Ioannis Paraskevakos, André Luckow, Mahzad Khoshlessan, George Chantzialexiou, Thomas E. Cheatham, Oliver Beckstein, Geoffrey C. Fox, Shantenu Jha:
Task-parallel Analysis of Molecular Dynamics Trajectories. 49:1-49:10
Performance Studies
- Meng Tang, Mohamed Gadou, Steven C. Rennich, Timothy A. Davis, Sanjay Ranka:
A Multilevel Subtree Method for Single and Batched Sparse Cholesky Factorization. 50:1-50:10
- Jan Hückelheim, Paul D. Hovland, Sri Hari Krishna Narayanan, Paulius Velesko:
Vectorised Computation of Diverging Ensembles. 51:1-51:10
- Moritz von Looz, Charilaos Tzovas, Henning Meyerhenke:
Balanced k-means for Parallel Geometric Partitioning. 52:1-52:10
Performance of Sparse Algorithms
- Xinliang Wang, Ping Xu, Wei Xue, Yulong Ao, Chao Yang, Haohuan Fu, Lin Gan, Guangwen Yang, Weimin Zheng:
A Fast Sparse Triangular Solver for Structured-grid Problems on Sunway Many-core Processor SW26010. 53:1-53:11
- Qiao Sun, Changyou Zhang, Changmao Wu, Jiajia Zhang, Leisheng Li:
Bandwidth Reduced Parallel SpMV on the SW26010 Many-Core Platform. 54:1-54:10
- Hong Zhang, Richard Tran Mills, Karl Rupp, Barry F. Smith:
Vectorized Parallel Sparse Matrix-Vector Multiplication in PETSc Using AVX-512. 55:1-55:10
Programming Models
- Brandon Hedden, Xinghui Zhao:
A Comprehensive Study on Bugs in Actor Systems. 56:1-56:9
- Konstantinos Krommydas, Paul Sathre, Ruchira Sasanka, Wu-chun Feng:
A Framework for Auto-Parallelization and Code Generation: An Integrative Case Study with Legacy FORTRAN Codes. 57:1-57:10
- Nathan T. Hjelm, Matthew G. F. Dosanjh, Ryan E. Grant, Taylor L. Groves, Patrick G. Bridges, Dorian C. Arnold:
Improving MPI Multi-threaded RMA Communication Performance. 58:1-58:11
Resilience and Reliability
- Omer Subasi, Chun-Kai Chang, Mattan Erez, Sriram Krishnamoorthy:
Characterizing the Impact of Soft Errors Affecting Floating-point ALUs using RTL-level Fault Injection. 59:1-59:10
- Zhichao Yan, Hong Jiang, Witawas Srisa-an, Sharad C. Seth, Yujuan Tan:
Leverage Redundancy in Hardware Transactional Memory to Improve Cache Reliability. 60:1-60:10
- Kai Wu, Wenqian Dong, Qiang Guan, Nathan DeBardeleben, Dong Li:
Modeling Application Resilience in Large-scale Parallel Execution. 61:1-61:10
Memory and Caching
- Xi Wang, John D. Leidel, Yong Chen:
Memory Coalescing for Hybrid Memory Cube. 62:1-62:10
- Muhammad M. Rafique, Zhichun Zhu:
CAMPS: Conflict-Aware Memory-Side Prefetching Scheme for Hybrid Memory Cube. 63:1-63:9
- Xian Zhu, Robert Wernsman, Joseph Zambreno:
Improving First Level Cache Efficiency for GPUs Using Dynamic Line Protection. 64:1-64:10
- Yuanrong Wang, Xueqi Li, Dawei Zang, Guangming Tan, Ninghui Sun:
Accelerating FM-index Search for Genomic Data Processing. 65:1-65:12
Resource Management
- Donglin Yang, Wei Rang, Dazhao Cheng:
Joint Optimization of MapReduce Scheduling and Network Policy in Hierarchical Clouds. 66:1-66:10
- Huazhe Zhang, Henry Hoffmann:
Performance & Energy Tradeoffs for Dependent Distributed Applications Under System-wide Power Caps. 67:1-67:11
- Minghao Zhao, Zhenhua Li, Ennan Zhai, Gareth Tyson, Chen Qian, Zhenyu Li, Leiyu Zhao:
H2Cloud: Maintaining the Whole Filesystem in an Object Storage Cloud. 68:1-68:10
- Xuesong Li, Wenxue Cheng, Tong Zhang, Jing Xie, Fengyuan Ren, Bailong Yang:
Power Efficient High Performance Packet I/O. 69:1-69:10
Runtime Systems
- D. Brian Larkins, John Snyder, James Dinan:
Efficient Runtime Support for a Partitioned Global Logical Address Space. 70:1-70:10
- Hoang-Vu Dang, Marc Snir:
FULT: Fast User-Level Thread Scheduling Using Bit-Vectors. 71:1-71:10
- Jason Hiebel, Laura E. Brown, Zhenlin Wang:
Constructing Dynamic Policies for Paging Mode Selection. 72:1-72:9
- Matthew G. F. Dosanjh, S. Mahdieh Ghazimirsaeed, Ryan E. Grant, Whit Schonbein, Michael J. Levenhagen, Patrick G. Bridges, Ahmad Afsahi:
The Case for Semi-Permanent Cache Occupancy: Understanding the Impact of Data Locality on Network Processing. 73:1-73:11
Parallel and Distributed Algorithms
- João Paulo de Araujo, Luciana Arantes, Elias P. Duarte Jr., Luiz A. Rodrigues, Pierre Sens:
A Communication-Efficient Causal Broadcast Protocol. 74:1-74:10
- Saurabh Kalikar, Rupesh Nasre:
NumLock: Towards Optimal Multi-Granularity Locking in Hierarchies. 75:1-75:10
- Fei Wang, Xiaofeng Gao, Jun Ye, Guihai Chen:
IS-ASGD: Accelerating Asynchronous SGD using Importance Sampling. 76:1-76:11
Performance of Graph Algorithms
- Yulin Che, Shixuan Sun, Qiong Luo:
Parallelizing Pruning-based Graph Structural Clustering. 77:1-77:10
- Deepak Ajwani, Erika Duriakova, Neil Hurley, Ulrich Meyer, Alexander Schickedanz:
An Empirical Comparison of k-Shortest Simple Path Algorithms on Multicores. 78:1-78:12
- Li Zhou, Ren Chen, Yinglong Xia, Radu Teodorescu:
C-Graph: A Highly Efficient Concurrent Graph Reachability Query Framework. 79:1-79:10
Storage
- Zhirong Shen, Patrick P. C. Lee:
Cross-Rack-Aware Updates in Erasure-Coded Data Centers. 80:1-80:10
- Xuchao Xie, Tianye Yang, Qiong Li, Dengping Wei, Liquan Xiao:
Duchy: Achieving Both SSD Durability and Controllable SMR Cleaning Overhead in Hybrid Storage Systems. 81:1-81:9
- Hua Wang, Xinbo Yi, Ping Huang, Bin Cheng, Ke Zhou:
Efficient SSD Caching by Avoiding Unnecessary Writes using Machine Learning. 82:1-82:10
Data Processing
- Song Wu, Zhiyi Liu, Shadi Ibrahim, Lin Gu, Hai Jin, Fei Chen:
Dual-Paradigm Stream Processing. 83:1-83:10
- Yusen Li, Xueyan Tang, Wentong Cai, Jiancong Tong, Xiaoguang Liu, Gang Wang, Chuansong Gao, Xuan Cao, Guanhui Geng, Minghui Li:
Index Shard Replication Strategies for Improving Resource Utilization in Large Scale Search Engines. 84:1-84:10
- Chen Zhang, Qiang Cao, Hong Jiang, Wenhui Zhang, Jingjun Li, Jie Yao:
FFS-VA: A Fast Filtering System for Large-scale Video Analytics. 85:1-85:10
I/O and File Systems
- Ram Kesavan, Matthew Curtis-Maury, Mrinal K. Bhattacharjee:
Efficient Search for Free Blocks in the WAFL File System. 86:1-86:10
- Xiaoyi Zhang, Dan Feng, Yu Hua, Jianxi Chen, Mandi Fu:
A Write-efficient and Consistent Hashing Scheme for Non-Volatile Memory. 87:1-87:10
- Tiago B. G. Perez, Xiaobo Zhou, Dazhao Cheng:
Reference-distance Eviction and Prefetching for Cache Management in Spark. 88:1-88:10
Matrix and Graph Algorithms
- Carl Yang, Aydin Buluç, John D. Owens:
Implementing Push-Pull Efficiently in GraphBLAS. 89:1-89:11
- Oguz Kaya, Ramakrishnan Kannan, Grey Ballard:
Partitioning and Communication Strategies for Sparse Non-negative Matrix Factorization. 90:1-90:10
- Michael P. Lingg, Stephen M. Hughey, Hasan Metin Aktulga:
Optimization of the Spherical Harmonics Transform based Tree Traversals in the Helmholtz FMM Algorithm. 91:1-91:11