ACM Transactions on Architecture and Code Optimization, Volume 14
Volume 14, Number 1, April 2017
- Lev Mukhanov, Pavlos Petoumenos, Zheng Wang, Konstantinos Parasyris, Dimitrios S. Nikolopoulos, Bronis R. de Supinski, Hugh Leather:
ALEA: A Fine-Grained Energy Profiling Tool. 1:1-1:25
- Anuj Pathania, Vanchinathan Venkataramani, Muhammad Shafique, Tulika Mitra, Jörg Henkel:
Defragmentation of Tasks in Many-Core Architecture. 2:1-2:21
- Darko Zivanovic, Milan Pavlovic, Milan Radulovic, Hyunsung Shin, Jongpil Son, Sally A. McKee, Paul M. Carpenter, Petar Radojkovic, Eduard Ayguadé:
Main Memory in HPC: Do We Need More or Could We Live with Less? 3:1-3:26
- Wenguang Zheng, Hui Wu, Qing Yang:
WCET-Aware Dynamic I-Cache Locking for a Single Task. 4:1-4:26
- Byung-Sun Yang, Jae-Yun Kim, Soo-Mook Moon:
Exceptionization: A Java VM Optimization for Non-Java Languages. 5:1-5:25
- Rathijit Sen, David A. Wood:
Pareto Governors for Energy-Optimal Computing. 6:1-6:25
- Mainak Chaudhuri, Mukesh Agrawal, Jayesh Gaur, Sreenivas Subramoney:
Micro-Sector Cache: Improving Space Utilization in Sectored DRAM Caches. 7:1-7:29
- Kyriakos Georgiou, Steve Kerrison, Zbigniew Chamski, Kerstin Eder:
Energy Transparency for Deeply Embedded Programs. 8:1-8:26
- Pengcheng Li, Xiaoyu Hu, Dong Chen, Jacob Brock, Hao Luo, Eddy Z. Zhang, Chen Ding:
LD: Low-Overhead GPU Race Detection Without Access Monitoring. 9:1-9:25
- Poovaiah M. Palangappa, Kartik Mohanram:
CompEx++: Compression-Expansion Coding for Energy, Latency, and Lifetime Improvements in MLC/TLC NVMs. 10:1-10:30
Volume 14, Number 2, July 2017
- Dongwoo Lee, Sang-Heon Lee, Soojung Ryu, Kiyoung Choi:
Dirty-Block Tracking in a Direct-Mapped DRAM Cache with Self-Balancing Dispatch. 11:1-11:25
- Konstantinos Parasyris, Vassilis Vassiliadis, Christos D. Antonopoulos, Spyros Lalis, Nikolaos Bellas:
Significance-Aware Program Execution on Unreliable Hardware. 12:1-12:25
- Gleison Souza Diniz Mendonca, Breno Campos Ferreira Guimarães, Péricles Alves, Márcio Machado Pereira, Guido Araujo, Fernando Magno Quintão Pereira:
DawnCC: Automatic Annotation for Data Parallelism and Offloading. 13:1-13:25
- Rajeev Balasubramonian, Andrew B. Kahng, Naveen Muralimanohar, Ali Shafiee, Vaishnav Srinivas:
CACTI 7: New Tools for Interconnect Exploration in Innovative Off-Chip Memories. 14:1-14:25
- Vishwesh Jatala, Jayvant Anantpur, Amey Karkare:
Scratchpad Sharing in GPUs. 15:1-15:29
- Tae Jun Ham, Juan L. Aragón, Margaret Martonosi:
Decoupling Data Supply from Computation for Latency-Tolerant Communication in Heterogeneous Architectures. 16:1-16:27
- Milan Stanic, Oscar Palomar, Timothy Hayes, Ivan Ratkovic, Adrián Cristal, Osman S. Unsal, Mateo Valero:
An Integrated Vector-Scalar Design on an In-Order ARM Core. 17:1-17:26
- Fernando A. Endo, Arthur Perais, André Seznec:
On the Interactions Between Value Prediction and Compiler Optimizations in the Context of EOLE. 18:1-18:24
- Aswinkumar Sridharan, Biswabandan Panda, André Seznec:
Band-Pass Prefetching: An Effective Prefetch Management Mechanism Using Prefetch-Fraction Metric in Multi-Core Systems. 19:1-19:27
- Andrés Goens, Sergio Siccha, Jerónimo Castrillón:
Symmetry in Software Synthesis. 20:1-20:26
Volume 14, Number 3, September 2017
- Sander Vocke, Henk Corporaal, Roel Jordans, Rosilde Corvino, Rick J. M. Nas:
Extending Halide to Improve Software Development for Imaging DSPs. 21:1-21:25
- Nicklas Bo Jensen, Sven Karlsson:
Improving Loop Dependence Analysis. 22:1-22:24
- Stefan Ganser, Armin Größlinger, Norbert Siegmund, Sven Apel, Christian Lengauer:
Iterative Schedule Optimization for Parallelization in the Polyhedron Model. 23:1-23:26
- Wei Wei, Dejun Jiang, Jin Xiong, Mingyu Chen:
HAP: Hybrid-Memory-Aware Partition in Shared Last-Level Cache. 24:1-24:25
- Dongliang Xiong, Kai Huang, Xiaowen Jiang, Xiaolang Yan:
Providing Predictable Performance via a Slowdown Estimation Model. 25:1-25:26
- Jing Pu, Steven Bell, Xuan Yang, Jeff Setter, Stephen Richardson, Jonathan Ragan-Kelley, Mark Horowitz:
Programming Heterogeneous Systems from an Image Processing DSL. 26:1-26:25
- Ayman Hroub, Muhammad E. S. Elrabaa, Muhamed F. Mudawar, Ahmad Khayyat:
Efficient Generation of Compact Execution Traces for Multicore Architectural Simulations. 27:1-27:25
- Nicolas Weber, Michael Goesele:
MATOG: Array Layout Auto-Tuning for CUDA. 28:1-28:26
- Amir Hossein Ashouri, Andrea Bignoli, Gianluca Palermo, Cristina Silvano, Sameer Kulkarni, John Cavazos:
MiCOMP: Mitigating the Compiler Phase-Ordering Problem Using Optimization Sub-Sequences and Machine Learning. 29:1-29:28
- Erik Vermij, Leandro Fiorin, Rik Jongerius, Christoph Hagleitner, Jan van Lunteren, Koen Bertels:
An Architecture for Integrated Near-Data Processors. 30:1-30:25
- Andreas Diavastos, Pedro Trancoso:
SWITCHES: A Lightweight Runtime for Dataflow Execution of Tasks on Many-Cores. 31:1-31:23
Volume 14, Number 4, December 2017
- Rahul Jain, Preeti Ranjan Panda, Sreenivas Subramoney:
Cooperative Multi-Agent Reinforcement Learning-Based Co-optimization of Cores, Caches, and On-chip Network. 32:1-32:25
- Daniele De Sensi, Tiziano De Matteis, Massimo Torquati, Gabriele Mencagli, Marco Danelutto:
Bringing Parallel Patterns Out of the Corner: The P³ARSEC Benchmark Suite. 33:1-33:26
- Chencheng Ye, Chen Ding, Hao Luo, Jacob Brock, Dong Chen, Hai Jin:
Cache Exclusivity and Sharing: Theory and Optimization. 34:1-34:26
- Rahul Shrivastava, V. Krishna Nandivada:
Energy-Efficient Compilation of Irregular Task-Parallel Loops. 35:1-35:29
- Julien Proy, Karine Heydemann, Alexandre Berzati, Albert Cohen:
Compiler-Assisted Loop Hardening Against Fault Attacks. 36:1-36:25
- Christina L. Peterson, Damian Dechev:
A Transactional Correctness Tool for Abstract Data Types. 37:1-37:24
- Matteo Ferroni, Andrea Corna, Andrea Damiani, Rolando Brondolin, Juan A. Colmenares, Steven A. Hofmeyr, John Kubiatowicz, Marco D. Santambrogio:
Power Consumption Models for Multi-Tenant Server Infrastructures. 38:1-38:22
- Milad Mohammadi, Tor M. Aamodt, William J. Dally:
CG-OoO: Energy-Efficient Coarse-Grain Out-of-Order Execution Near In-Order Energy with Near Out-of-Order Performance. 39:1-39:26
- Shivam Swami, Poovaiah M. Palangappa, Kartik Mohanram:
ECS: Error-Correcting Strings for Lifetime Improvements in Nonvolatile Memories. 40:1-40:29
- Muhammad Waqar Azhar, Per Stenström, Vassilis Papaefstathiou:
SLOOP: QoS-Supervised Loop Execution to Reduce Energy on Heterogeneous Architectures. 41:1-41:25
- Kanakagiri Raghavendra, Biswabandan Panda, Madhu Mutyam:
MBZip: Multiblock Data Compression. 42:1-42:29
- Richard Neill, Andi Drebes, Antoniu Pop:
Fuse: Accurate Multiplexing of Hardware Performance Counters Across Executions. 43:1-43:26
- Somayeh Sardashti, David A. Wood:
Could Compression Be of General Use? Evaluating Memory Compression across Domains. 44:1-44:24
- Libo Huang, Ya-Shuai Lü, Li Shen, Zhiying Wang:
Improving the Efficiency of GPGPU Work-Queue Through Data Awareness. 45:1-45:22
- Alexandra Angerd, Erik Sintorn, Per Stenström:
A Framework for Automated and Controlled Floating-Point Accuracy Reduction in Graphics Applications on GPUs. 46:1-46:25
- Jaime Arteaga, Stéphane Zuckerman, Guang R. Gao:
Generating Fine-Grain Multithreaded Applications Using a Multigrain Approach. 47:1-47:26
- Ramyad Hadidi, Lifeng Nai, Hyojong Kim, Hyesoon Kim:
CAIRO: A Compiler-Assisted Technique for Enabling Instruction-Level Offloading of Processing-In-Memory. 48:1-48:25
- Hongyeol Lim, Giho Park:
Triple Engine Processor (TEP): A Heterogeneous Near-Memory Processor for Diverse Kernel Operations. 49:1-49:25
- George Patsilaras, James Tuck:
ReDirect: Reconfigurable Directories for Multicore Architectures. 50:1-50:23
- Adarsh Patil, Ramaswamy Govindarajan:
HAShCache: Heterogeneity-Aware Shared DRAMCache for Integrated Heterogeneous Systems. 51:1-51:26
- Christophe Alias, Alexandru Plesco:
Optimizing Affine Control With Semantic Factorizations. 52:1-52:22
- George Matheou, Paraskevas Evripidou:
Data-Driven Concurrency for High Performance Computing. 53:1-53:26
- Giorgis Georgakoudis, Hans Vandierendonck, Peter Thoman, Bronis R. de Supinski, Thomas Fahringer, Dimitrios S. Nikolopoulos:
SCALO: Scalability-Aware Parallelism Orchestration for Multi-Threaded Workloads. 54:1-54:25
- Toufik Baroudi, Rachid Seghir, Vincent Loechner:
Optimization of Triangular and Banded Matrix Operations Using 2d-Packed Layouts. 55:1-55:19