PACT 2016: Haifa, Israel
- Ayal Zaks, Bilha Mendelson, Lawrence Rauchwerger, Wen-mei W. Hwu:
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, PACT 2016, Haifa, Israel, September 11-15, 2016. ACM 2016, ISBN 978-1-4503-4121-9
Session 1: Keynote
- Arvind:
Big Data Analytics on Flash Storage with Accelerators. 1
Session 2A: GPU - Architectures
- Jingweijia Tan, Shuaiwen Leon Song, Kaige Yan, Xin Fu, Andrès Márquez, Darren J. Kerbyson:
Combating the Reliability Challenge of GPU Register File at Low Supply Voltage. 3-15
- Onur Kayiran, Adwait Jog, Ashutosh Pattnaik, Rachata Ausavarungnirun, Xulong Tang, Mahmut T. Kandemir, Gabriel H. Loh, Onur Mutlu, Chita R. Das:
μC-States: Fine-grained GPU Datapath Power Management. 17-30
- Ashutosh Pattnaik, Xulong Tang, Adwait Jog, Onur Kayiran, Asit K. Mishra, Mahmut T. Kandemir, Onur Mutlu, Chita R. Das:
Scheduling Techniques for GPU Architectures with Processing-In-Memory Capabilities. 31-44
- Bin Wang, Yue Zhu, Weikuan Yu:
OAWS: Memory Occlusion Aware Warp Scheduling. 45-55
Session 2B: Performance Optimizations
- Bruno Bodin, Luigi Nardi, M. Zeeshan Zia, Harry Wagstaff, Govind Sreekar Shenoy, Murali Krishna Emani, John Mawer, Christos Kotselidis, Andy Nisbet, Mikel Luján, Björn Franke, Paul H. J. Kelly, Michael F. P. O'Boyle:
Integrating Algorithmic Parameters into Benchmarking and Design Space Exploration in 3D Scene Understanding. 57-69
- Mads Ruben Burgdorff Kristensen, Simon Andreas Frimann Lund, Troels Blum, James Avery:
Fusion of Parallel Array Operations. 71-85
- Chandan Reddy, Michael Kruse, Albert Cohen:
Reduction Drawing: Language Constructs and Polyhedral Compilation for Reductions on GPU. 87-97
- Prashant Singh Rawat, Changwan Hong, Mahesh Ravishankar, Vinod Grover, Louis-Noël Pouchet, Atanas Rountev, P. Sadayappan:
Resource Conscious Reuse-Driven Tiling for GPUs. 99-111
Session 3: Best Paper
- Byungchul Hong, Gwangsun Kim, Jung Ho Ahn, Yongkee Kwon, Hongsik Kim, John Kim:
Accelerating Linked-list Traversal Through Near-Data Processing. 113-124
- Andi Drebes, Antoniu Pop, Karine Heydemann, Albert Cohen, Nathalie Drach:
Scalable Task Parallelism for NUMA: A Uniform Abstraction for Coordinated Scheduling and Memory Management. 125-137
- Shintaro Iwasaki, Kenjiro Taura:
A Static Cut-off for Task Parallel Programs. 139-150
Session 4: Keynote
- Yale N. Patt:
Greater Performance and Better Efficiency: Predicated Execution has shown us the way. 151
Session 5A: System Optimization I
- Sanyam Mehta, Josep Torrellas:
WearCore: A Core for Wearable Workloads. 153-164
- Sudarsun Kannan, Moinuddin K. Qureshi, Ada Gavrilovska, Karsten Schwan:
Energy Aware Persistence: Reducing Energy Overheads of Memory-based Persistence in NVMs. 165-177
- Neha Gholkar, Frank Mueller, Barry Rountree:
Power Tuning HPC Jobs on Power-Constrained Systems. 179-191
- Younghyun Cho, Surim Oh, Bernhard Egger:
Online Scalability Characterization of Data-Parallel Programs on Many Cores. 191-205
Session 5B: Parallel Software Optimization
- Jialu Huang, Prakash Prabhu, Thomas B. Jablin, Soumyadeep Ghosh, Sotiris Apostolakis, Jae W. Lee, David I. August:
Speculatively Exploiting Cross-Invocation Parallelism. 207-221
- Junqiao Qiu, Zhijia Zhao, Bin Ren:
MicroSpec: Speculation-Centric Fine-Grained Parallelization for FSM Computations. 221-233
- Dibakar Gope, Mikko H. Lipasti:
Hash Map Inlining. 235-246
- Hongbo Rong, Jongsoo Park, Lingxiang Xiang, Todd A. Anderson, Mikhail Smelyanskiy:
Sparso: Context-driven Optimizations of Sparse Linear Algebra. 247-259
Session 6A: Cache Coherence
- Xiangyao Yu, Hongzhe Liu, Ethan Zou, Srinivas Devadas:
Tardis 2.0: Optimized Time Traveling Coherence for Relaxed Consistency Models. 261-274
- Paul Caheny, Marc Casas, Miquel Moretó, Hervé Gloaguen, Maxime Saintes, Eduard Ayguadé, Jesús Labarta, Mateo Valero:
Reducing Cache Coherence Traffic with Hierarchical Directory Cache and NUMA-Aware Runtime Scheduling. 275-286
Session 6B: Memory Access Efficiency
- Yong Zhao, Jia Rao, Qing Yi:
Characterizing and Optimizing the Performance of Multithreaded Programs Under Interference. 287-297
- Vladimir Kiriansky, Yunming Zhang, Saman P. Amarasinghe:
Optimizing Indirect Memory References with milk. 299-312
Session 7: Keynote
- Kunle Olukotun:
Scaling Data Analytics with Moore's Law. 313
Session 8A: System Acceleration
- Mingcong Song, Yang Hu, Yunlong Xu, Chao Li, Huixiang Chen, Jingling Yuan, Tao Li:
Bridging the Semantic Gaps of GPU Acceleration for Scale-out CNN-based Big Data Processing: Think Big, See Small. 315-326
- Nitin Chugh, Vinay Vasista, Suresh Purini, Uday Bondhugula:
A DSL Compiler for Accelerating Image Processing Pipelines on FPGAs. 327-338
- Gwangsun Kim, Jiyun Jeong, John Kim, Mark Stephenson:
Automatically Exploiting Implicit Pipeline Parallelism from Multiple Dependent Kernels for GPUs. 341-352
- Yipeng Wang, Ren Wang, Andrew Herdrich, James Tsai, Yan Solihin:
CAF: Core to Core Communication Acceleration Framework. 351-362
Session 8B: System Optimization II
- Andrew Anderson, David Gregg:
Vectorization of Multibyte Floating Point Data Formats. 363-372
- Sankaralingam Panneerselvam, Michael M. Swift:
Rinnegan: Efficient Resource Use in Heterogeneous Architectures. 373-386
- Zhen Jia, Chao Xue, Guancheng Chen, Jianfeng Zhan, Lixin Zhang, Yonghua Lin, H. Peter Hofstee:
Auto-tuning Spark Big Data Workloads on POWER8: Prediction-Based Dynamic SMT Threading. 387-400
- Heiner Litz, Benjamin Braun, David R. Cheriton:
EXCITE-VM: Extending the Virtual Memory System to Support Snapshot Isolation Transactions. 401-412
Poster Presentations
- Rahul Boyapati, Jiayi Huang, Ningyuan Wang, Kyung Hoon Kim, Ki Hwan Yum, Eun Jung Kim:
POSTER: Fly-Over: A Light-Weight Distributed Power-Gating Mechanism For Energy-Efficient Networks-on-Chip. 413-414
- Kallia Chronaki, Miquel Moretó, Marc Casas, Alejandro Rico, Rosa M. Badia, Eduard Ayguadé, Jesús Labarta, Mateo Valero:
POSTER: Exploiting Asymmetric Multi-Core Processors with Flexible System Sofware. 415-417
- Fady Ghanim, Rajeev Barua, Uzi Vishkin:
POSTER: Easy PRAM-based High-Performance Parallel Programming with ICE. 419-420
- Florian Haas, Sebastian Weis, Theo Ungerer, Gilles Pokam, Youfeng Wu:
POSTER: Fault-tolerant Execution on COTS Multi-core Processors with Hardware Transactional Memory Support. 421-422
- Guray Ozen, Eduard Ayguadé, Jesús Labarta:
POSTER: Collective Dynamic Parallelism for Directive Based GPU Programming Languages and Compilers. 423-424
- Sankaralingam Panneerselvam, Michael M. Swift:
POSTER: Firestorm: Operating Systems for Power-Constrained Architectures. 425-427
- Miquel Pericàs:
POSTER: ξ-TAO: A Cache-centric Execution Model and Runtime for Deep Parallel Multicore Topologies. 429-431
- Alberto Ros, Carl Leonardsson, Christos Sakalis, Stefanos Kaxiras:
POSTER: Efficient Self-Invalidation/Self-Downgrade for Critical Sections with Relaxed Semantics. 433-434
- Jee Ho Ryoo, Mitesh R. Meswani, Reena Panda, Lizy K. John:
POSTER: SILC-FM: Subblocked InterLeaved Cache-Like Flat Memory Organization. 435-437
- Diogo Nunes Sampaio, Alain Ketterlin, Louis-Noël Pouchet, Fabrice Rastello:
POSTER: Hybrid Data Dependence Analysis for Loop Transformations. 439-440
- Xiaowei Shen, Xiaochun Ye, Xu Tan, Da Wang, Zhimin Zhang, Dongrui Fan, Zhimin Tang:
POSTER: An Optimization of Dataflow Architectures for Scientific Applications. 441-442
- Prakalp Srivastava, Maria Kotsifakou, Matthew D. Sinclair, Rakesh Komuravelli, Vikram S. Adve, Sarita V. Adve:
POSTER: hVISC: A Portable Abstraction for Heterogeneous Parallel Systems. 443-445
- Milan Stanic, Oscar Palomar, Timothy Hayes, Ivan Ratkovic, Osman S. Unsal, Adrián Cristal, Mateo Valero:
POSTER: An Integrated Vector-Scalar Design on an In-order ARM Core. 447-448
- Tsung Tai Yeh, Amit Sabne, Putt Sakdhnagool, Rudolf Eigenmann, Timothy G. Rogers:
POSTER: Pagoda: A Runtime System to Maximize GPU Utilization in Data Parallel Tasks with Limited Parallelism. 449-450
Student Research Poster Presentations
- Saumay Dublish:
Student Research Poster: Slack-Aware Shared Bandwidth Management in GPUs. 451-452
- Roman Kaplan:
Student Research Poster: From Processing-in-Memory to Processing-in-Storage. 453
- Arthur Kiyanovski:
Student Research Poster: Network Controller Emulation on a Sidecore for Unmodified Virtual Machines. 454
- Vicent Selfa, Julio Sahuquillo, Salvador Petit, María Engracia Gómez:
Student Research Poster: A Low Complexity Cache Sharing Mechanism to Address System Fairness. 455
- Jiawen Sun:
Student Research Poster: A Scalable General Purpose System for Large-Scale Graph Processing. 456
- Vladislav Tartakovsky:
Student Research Poster: Compiling Boolean Circuits to Non-deterministic Branching Programs to be Implemented by Light Switching Circuits. 457
- Kim-Anh Tran:
Student Research Poster: Software Out-of-Order Execution for In-Order Architectures. 458