28th HiPC 2021: Bengaluru, India
- 28th IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2021, Bengaluru, India, December 17-20, 2021. IEEE 2021, ISBN 978-1-6654-1016-8
- Adam Belay:
Improving Efficiency and Performance Through Faster Scheduling Mechanisms. xxii
- Jingren Zhou:
Towards an Integral System for Processing Big Graphs at Scale. xxi
- Chi Zhang, Sanmukh Rao Kuppannagari, Viktor K. Prasanna:
Parallel Actors and Learners: A Framework for Generating Scalable RL Implementations. 1-10
- Michela Taufer:
AI4IO: A Suite of AI-Based Tools for IO-Aware HPC Resource Management. 1
- Amal Gueroudji, Julien Bigot, Bruno Raffin:
DEISA: Dask-Enabled In Situ Analytics. 11-20
- A. Srinivas Reddy, P. Krishna Reddy, Anirban Mondal, U. Deva Priyakumar:
A Model of Graph Transactional Coverage Patterns with Applications to Drug Discovery. 21-30
- Eliza Wszola, Martin Jaggi, Markus Püschel:
Faster Parallel Training of Word Embeddings. 31-41
- Nariaki Tateiwa, Yuji Shinano, Keiichiro Yamamura, Akihiro Yoshida, Shizuo Kaji, Masaya Yasuda, Katsuki Fujisawa:
CMAP-LAP: Configurable Massively Parallel Solver for Lattice Problems. 42-52
- Hwajung Kim, Jiwoo Bang, Dong Kyu Sung, Hyeonsang Eom, Heon Y. Yeom, Hanul Sung:
MulConn: User-Transparent I/O Subsystem for High-Performance Parallel File Systems. 53-62
- Ta-Yang Wang, William Chang, Ajitesh Srivastava, Rajgopal Kannan, Viktor K. Prasanna:
Monte Carlo Tree Search for Task Mapping onto Heterogeneous Platforms. 63-70
- Johannes Langguth, Ioannis Panagiotas, Bora Uçar:
Shared-memory implementation of the Karp-Sipser kernelization process. 71-80
- Yuan Meng, Sanmukh R. Kuppannagari, Rajgopal Kannan, Viktor K. Prasanna:
How to Avoid Zero-Spacing in Fractionally-Strided Convolution? A Hardware-Algorithm Co-Design Methodology. 81-90
- Jiawen Guan, Rui Fan:
PPBT: A High Performance Parallel Search Tree. 91-100
- Esragul Korkmaz, Mathieu Faverge, Pierre Ramet, Grégoire Pichon:
Deciding Non-Compressible Blocks in Sparse Direct Solvers using Incomplete Factorization. 101-110
- Athreya Chandramouli, Sayantan Jana, Kishore Kothapalli:
Efficient Parallel Algorithms for Computing Percolation Centrality. 111-120
- André Weißenberger, Bertil Schmidt:
Accelerating JPEG Decompression on GPUs. 121-130
- Kai Keller, Adrián Cristal Kestelman, Leonardo Bautista-Gomez:
Towards Zero-Waste Recovery and Zero-Overhead Checkpointing in Ensemble Data Assimilation. 131-140
- Archie Powell, K. Choudry, Arun Prabhakar, I. Z. Reguly, Dario Amirante, Stephen A. Jarvis, Gihan R. Mudalige:
Predictive Analysis of Large-Scale Coupled CFD Simulations with the CPX Mini-App. 141-151
- Akihiro Tabuchi, Koichi Shirahata, Masafumi Yamazaki, Akihiko Kasagi, Takumi Honda, Kouji Kurihara, Kentaro Kawakami, Tsuguchika Tabaru, Naoto Fukumoto, Akiyoshi Kuroda, Takaaki Fukai, Kento Sato:
The 16,384-node Parallelism of 3D-CNN Training on An Arm CPU based Supercomputer. 152-161
- Luk Burchard, Xing Cai, Johannes Langguth:
iPUG for Multiple Graphcore IPUs: Optimizing Performance and Scalability of Parallel Breadth-First Search. 162-171
- K. P. Arun, Debadatta Mishra, Biswabandan Panda:
Empirical Analysis of Architectural Primitives for NVRAM Consistency. 172-181
- Kazuaki Matsumura, Simon Garcia de Gonzalo, Antonio J. Peña:
JACC: An OpenACC Runtime Framework with Kernel-Level and Multi-GPU Parallelization. 182-191
- Oded Green, Zhihui Du, Sanyamee Patel, Zehui Xie, Hang Liu, David A. Bader:
Anti-Section Transitive Closure. 192-201
- Xiaojing An, Ümit V. Çatalyürek:
Column-Segmented Sparse Matrix-Matrix Multiplication on Multicore CPUs. 202-211
- Arjun Gopala Krishnan, Dhrubajyoti Goswami:
Multi-Stage Memory Efficient Strassen's Matrix Multiplication on GPU. 212-221
- Md Nahid Newaz, Md Atiqul Mollah:
Optimizing k-path selection for randomized interconnection networks. 222-231
- Siqin Liu, Avinash Karanth:
Dynamic Voltage and Frequency Scaling to Improve Energy-Efficiency of Hardware Accelerators. 232-241
- Zhe Wang, Pradeep Subedi, Matthieu Dorier, Philip E. Davis, Manish Parashar:
Adaptive Placement of Data Analysis Tasks For Staging Based In-Situ Processing. 242-251
- Qihan Wang, Wei Niu, Li Chen, Ruoming Jin, Bin Ren:
HEALS: A Parallel eALS Recommendation System on CPU/GPU Heterogeneous Platforms. 252-261
- Xiang Li, Gagan Agrawal:
Shrinking Sample Search Algorithm for Automatic Tuning of GPU Kernels. 262-271
- Bharath Ramesh, Jahanzeb Maqbool Hashmi, Shulei Xu, Aamir Shafi, Seyedeh Mahdieh Ghazimirsaeed, Mohammadreza Bayatpour, Hari Subramoni, Dhabaleswar K. Panda:
Towards Architecture-aware Hierarchical Communication Trees on Modern HPC Systems. 272-281
- Yuntian He, Saket Gurukar, Pouya Kousha, Hari Subramoni, Dhabaleswar K. Panda, Srinivasan Parthasarathy:
DistMILE: A Distributed Multi-Level Framework for Scalable Graph Embedding. 282-291
- Jinlai Xu, Balaji Palanisamy:
Model-based Reinforcement Learning for Elastic Stream Processing in Edge Computing. 292-301
- Kaushik Kandadi Suresh, Bharath Ramesh, Chen-Chun Chen, Seyedeh Mahdieh Ghazimirsaeed, Mohammadreza Bayatpour, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Layout-aware Hardware-assisted Designs for Derived Data Types in MPI. 302-311
- Xu T. Liu, Jesun Firoz, Andrew Lumsdaine, Cliff A. Joslyn, Sinan Aksoy, Brenda Praggastis, Assefaw H. Gebremedhin:
Parallel Algorithms for Efficient Computation of High-Order Line Graphs of Hypergraphs. 312-321
- Sunwoo Lee, Qiao Kang, Kewei Wang, Jan Balewski, Alex Sim, Ankit Agrawal, Alok N. Choudhary, Peter Nugent, Kesheng Wu, Wei-keng Liao:
Asynchronous I/O Strategy for Large-Scale Deep Learning Applications. 322-331
- Srinivasan Ramesh, Robert B. Ross, Matthieu Dorier, Allen D. Malony, Philip H. Carns, Kevin A. Huck:
SYMBIOMON: A High-Performance, Composable Monitoring Service. 332-342
- Ke Fan, Duong Hoang, Steve Petruzza, Thomas Gilray, Valerio Pascucci, Sidharth Kumar:
Load-balancing Parallel I/O of Compressed Hierarchical Layouts. 343-353
- Madhav Poudel, Michael Gowanlock:
CUDA-DClust+: Revisiting Early GPU-Accelerated DBSCAN Clustering Designs. 354-363
- Leonel Toledo, Pedro Valero-Lara, Jeffrey S. Vetter, Antonio J. Peña:
Static Graphs for Coding Productivity in OpenACC. 364-369
- Madhav Aggarwal, Bingyi Zhang, Viktor K. Prasanna:
Performance of Local Push Algorithms for Personalized PageRank on Multi-core Platforms. 370-375
- Jacob Tronge, Patricia Grubel, Timothy Randles, Quincy Wofford, Rusty Davis, Steven Anaya, Qiang Guan:
BEE Orchestrator: Running Complex Scientific Workflows on Multiple Systems. 376-381
- Hércules Cardoso da Silva, Marco Aurelio Stefanes, Vinícius Capistrano:
OpenACC Multi-GPU Approach for WSM6 Microphysics. 382-387
- Nick Sarkauskas, Mohammadreza Bayatpour, Tu Tran, Bharath Ramesh, Hari Subramoni, Dhabaleswar K. Panda:
Large-Message Nonblocking MPI_Iallgather and MPI_Ibcast Offload via BlueField-2 DPU. 388-393
- Yuanjian Liu, Sheng Di, Kai Zhao, Sian Jin, Cheng Wang, Kyle Chard, Dingwen Tao, Ian T. Foster, Franck Cappello:
Optimizing Multi-Range based Error-Bounded Lossy Compression for Scientific Datasets. 394-399
- Jiwoo Bang, Chungyong Kim, Kesheng Wu, Alex Sim, Suren Byna, Hanul Sung, Hyeonsang Eom:
An In-Depth I/O Pattern Analysis in HPC Systems. 400-405
- Anshuj Garg, Purushottam Kulkarni, Umesh Bellur, Sriram Yenamandra:
FaaSter: Accelerated Functions-as-a-Service with Heterogeneous GPUs. 406-411
- Salman Salloum, Joshua Zhexue Huang:
RSP-Hist: Approximate Histograms for Big Data Exploration on Hadoop Clusters. 412-417
- Shuangsheng Lou, Gagan Agrawal:
A Programming API Implementation for Secure Data Analytics Applications with Homomorphic Encryption on GPUs. 418-423
- Jia Guo, Radu Teodorescu, Gagan Agrawal:
A Fused Inference Design for Pattern-Based Sparse CNN on Edge Devices. 424-429
- Edigley Fraga, Ana Cortés, Tomàs Margalef, Porfidio Hernández:
Cloud-Based Urgent Computing for Forest Fire Spread Prediction under Data Uncertainties. 430-435
- Mostafa Eghbali Zarch, Reece Neff, Michela Becchi:
Exploring Thread Coarsening on FPGA. 436-441
- John Ravi, Tri Nguyen, Huiyang Zhou, Michela Becchi:
PILOT: a Runtime System to Manage Multi-tenant GPU Unified Memory Footprint. 442-447
- S. Chandra Sekhara Rao, Rabia Kamra:
A computational technique for parallel solution of diagonally dominant banded linear systems. 448-453