ACM Transactions on Architecture and Code Optimization, Volume 11
Volume 11, Number 1, February 2014
- Neeraj Goel, Anshul Kumar, Preeti Ranjan Panda:
Shared-port register file architecture for low-energy VLIW processors. 1:1-1:32
- Zheng Wang, Georgios Tournavitis, Björn Franke, Michael F. P. O'Boyle:
Integrating profile-driven parallelism detection and machine-learning-based mapping. 2:1-2:26
- Mehrzad Samadi, Amir Hormati, Janghaeng Lee, Scott A. Mahlke:
Leveraging GPUs using cooperative loop speculation. 3:1-3:26
- Jue Wang, Xiangyu Dong, Yuan Xie, Norman P. Jouppi:
Endurance-aware cache line management for non-volatile caches. 4:1-4:25
- Lei Liu, Zehan Cui, Yong Li, Yungang Bao, Mingyu Chen, Chengyong Wu:
BPM/BPM+: Software-based dynamic memory partitioning mechanisms for mitigating DRAM bank-/channel-level interferences in multicore systems. 5:1-5:28
- Christian Häubl, Christian Wimmer, Hanspeter Mössenböck:
Trace transitioning and exception handling in a trace-based JIT compiler for Java. 6:1-6:26
- Yongbing Huang, Licheng Chen, Zehan Cui, Yuan Ruan, Yungang Bao, Mingyu Chen, Ninghui Sun:
HMTT: A hybrid hardware/software tracing system for bridging the DRAM access trace's semantic gap. 7:1-7:25
- Quan Chen, Minyi Guo:
Adaptive workload-aware task scheduling for single-ISA asymmetric multicore architectures. 8:1-8:25
- Gülfem Savrun-Yeniçeri, Wei Zhang, Huahan Zhang, Eric Seckler, Chen Li, Stefan Brunthaler, Per Larsen, Michael Franz:
Efficient hosted interpreters on the JVM. 9:1-9:24
- Prashant J. Nair, Chia-Chen Chou, Moinuddin K. Qureshi:
Refresh pausing in DRAM memory systems. 10:1-10:26
- Komal Jothi, Haitham Akkary:
Tuning the continual flow pipeline architecture with virtual register renaming. 11:1-11:27
- Thomas Carle, Dumitru Potop-Butucaru:
Predicate-aware, makespan-preserving software pipelining of scheduling tables. 12:1-12:26
- Angeliki Kritikakou, Francky Catthoor, Vasilios I. Kelefouras, Costas E. Goutis:
A scalable and near-optimal representation of access schemes for memory management. 13:1-13:25
- Hugh Leather, Edwin V. Bonilla, Michael F. P. O'Boyle:
Automatic feature generation for machine learning-based optimising compilation. 14:1-14:32
Volume 11, Number 2, June 2014
- Theo Kluter, Samuel Burri, Philip Brisk, Edoardo Charbon, Paolo Ienne:
Virtual Ways: Low-Cost Coherence for Instruction Set Extensions with Architecturally Visible Storage. 15:1-15:26
- Bin Ren, Todd Mytkowicz, Gagan Agrawal:
A Portable Optimization Engine for Accelerating Irregular Data-Traversal Applications on SIMD Architectures. 16:1-16:31
- Zhengwei Qi, Jianguo Yao, Chao Zhang, Miao Yu, Zhizhou Yang, Haibing Guan:
VGRIS: Virtualized GPU Resource Isolation and Scheduling in Cloud Gaming. 17:1-17:25
- Bor-Yeh Shen, Wei-Chung Hsu, Wuu Yang:
A Retargetable Static Binary Translator for the ARM Architecture. 18:1-18:25
- Darío Suárez Gracia, Alexandra Ferrerón-Labari, Luis Montesano Del Campo, Teresa Monreal Arnal, Víctor Viñals Yúfera:
Revisiting LP-NUCA Energy Consumption: Cache Access Policies and Adaptive Block Dropping. 19:1-19:26
- Zhibin Liang, Wei Zhang, Yung-Cheng Ma:
Deadline-Constrained Clustered Scheduling for VLIW Architectures using Power-Gated Register Files. 20:1-20:26
- Shuangde Fang, Zidong Du, Yuntan Fang, Yuanjie Huang, Yang Chen, Lieven Eeckhout, Olivier Temam, Huawei Li, Yunji Chen, Chengyong Wu:
Performance Portability Across Heterogeneous SoCs Using a Generalized Library-Based Approach. 21:1-21:25
- Abdul Rahman Kaitoua, Hazem M. Hajj, Mazen A. R. Saghir, Hassan Artail, Haitham Akkary, Mariette Awad, Mageda Sharafeddine, Khaleel W. Mershad:
Hadoop Extensions for Distributed Computing on Reconfigurable Active SSD Clusters. 22:1-22:26
Volume 11, Number 3, July/August 2014
- Jue Wang, Xiangyu Dong, Yuan Xie:
Preventing STT-RAM Last-Level Caches from Port Obstruction. 23:1-23:19
- Miguel A. Gonzalez-Mesa, Eladio Gutiérrez, Emilio L. Zapata, Oscar G. Plata:
Effective Transactional Memory Execution Management for Improved Concurrency. 24:1-24:27
- Rakesh Kumar, Alejandro Martínez, Antonio González:
Efficient Power Gating of SIMD Accelerators Through Dynamic Selective Devectorization in an HW/SW Codesigned Environment. 25:1-25:23
- Stefano Di Carlo, Salvatore Galfano, Marco Indaco, Paolo Prinetto, Davide Bertozzi, Piero Olivo, Cristian Zambelli:
FLARES: An Aging Aware Algorithm to Autonomously Adapt the Error Correction Capability in NAND Flash Memories. 26:1-26:25
- Davide B. Bartolini, Filippo Sironi, Donatella Sciuto, Marco D. Santambrogio:
Automated Fine-Grained CPU Provisioning for Virtual Machines. 27:1-27:25
- Trevor E. Carlson, Wim Heirman, Stijn Eyerman, Ibrahim Hur, Lieven Eeckhout:
An Evaluation of High-Level Mechanistic Core Models. 28:1-28:25
- Farrukh Hijaz, Omer Khan:
NUCA-L1: A Non-Uniform Access Latency Level-1 Cache Architecture for Multicores Operating at Near-Threshold Voltages. 29:1-29:28
- Andi Drebes, Karine Heydemann, Nathalie Drach, Antoniu Pop, Albert Cohen:
Topology-Aware and Dependence-Aware Scheduling and Memory Allocation for Task-Parallel Languages. 30:1-30:25
- Venkata Kalyan Tavva, Ravi Kasha, Madhu Mutyam:
EFGR: An Enhanced Fine Granularity Refresh Feature for High-Performance DDR4 DRAM Devices. 31:1-31:26
- Gulay Yalcin, Oguz Ergin, Emrah Islek, Osman Sabri Unsal, Adrián Cristal:
Exploiting Existing Comparators for Fine-Grained Low-Cost Error Detection. 32:1-32:24
- Pradeep Ramachandran, Siva Kumar Sastry Hari, Man-Lap Li, Sarita V. Adve:
Hardware Fault Recovery for I/O Intensive Applications. 33:1-33:25
- Stijn Eyerman, Pierre Michaud, Wouter Rogiest:
Multiprogram Throughput Metrics: A Systematic Approach. 34:1-34:26
Volume 11, Number 4, December 2014
- Cedric Nugteren, Henk Corporaal:
Bones: An Automatic Skeleton-Based C-to-CUDA Compiler for GPUs. 35:1-35:25
- Jue Wang, Xiangyu Dong, Yuan Xie:
Building and Optimizing MRAM-Based Commodity Memories. 36:1-36:22
- Rakesh Komuravelli, Sarita V. Adve, Ching-Tsun Chou:
Revisiting the Complexity of Hardware Cache Coherence and Some Implications. 37:1-37:22
- Gabriel Rodríguez, Juan Touriño, Mahmut T. Kandemir:
Volatile STT-RAM Scratchpad Design and Data Allocation for Low Energy. 38:1-38:26
- Cristobal Camarero, Enrique Vallejo, Ramón Beivide:
Topological Characterization of Hamming and Dragonfly Networks and Its Implications on Routing. 39:1-39:25
- HanBin Yoon, Justin Meza, Naveen Muralimanohar, Norman P. Jouppi, Onur Mutlu:
Efficient Data Mapping and Buffering Techniques for Multilevel Cell Phase-Change Memories. 40:1-40:25
- Nathanaël Prémillieu, André Seznec:
Efficient Out-of-Order Execution of Guarded ISAs. 41:1-41:21
- Zheng Wang, Dominik Grewe, Michael F. P. O'Boyle:
Automatic and Portable Mapping of Data Parallel Programs to OpenCL for GPU-Based Heterogeneous Systems. 42:1-42:26
- Dan He, Fang Wang, Hong Jiang, Dan Feng, Jingning Liu, Wei Tong, Zheng Zhang:
Improving Hybrid FTL by Fully Exploiting Internal SSD Parallelism with Virtual Blocks. 43:1-43:19
- Eri Rubin, Ely Levy, Amnon Barak, Tal Ben-Nun:
MAPS: Optimizing Massively Parallel Applications Using Device-Level Memory Abstraction. 44:1-44:22
Volume 11, Number 4, January 2015
- Alessandro Cilardo, Luca Gallo:
Improving Multibank Memory Access Parallelism with Lattice-Based Partitioning. 45:1-45:25
- Jan Kasper Martinsen, Håkan Grahn, Anders Isberg:
The Effects of Parameter Tuning in Software Thread-Level Speculation in JavaScript Engines. 46:1-46:25
- Quentin Colombet, Florian Brandner, Alain Darte:
Studying Optimal Spilling in the Light of SSA. 47:1-47:26
- Jawad Haj-Yihia, Yosi Ben-Asher, Efraim Rotem, Ahmad Yasin, Ran Ginosar:
Compiler-Directed Power Management for Superscalars. 48:1-48:21
- Hong-Phuc Trinh, Marc Duranton, Michel Paindavoine:
Efficient Data Encoding for Convolutional Neural Network application. 49:1-49:21
- Maximilien Breughe, Stijn Eyerman, Lieven Eeckhout:
Mechanistic Analytical Modeling of Superscalar In-Order Processor Performance. 50:1-50:26
- Vivek Seshadri, Samihan Yedkar, Hongyi Xin, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry:
Mitigating Prefetcher-Caused Pollution Using Informed Caching Policies for Prefetched Blocks. 51:1-51:22
- George Matheou, Paraskevas Evripidou:
Architectural Support for Data-Driven Execution. 52:1-52:25
- Amir Morad, Leonid Yavits, Ran Ginosar:
GP-SIMD Processing-in-Memory. 53:1-53:26
- Thomas Schaub, Simon Moll, Ralf Karrenberg, Sebastian Hack:
The Impact of the SIMD Width on Control-Flow and Memory Divergence. 54:1-54:25
- Zhenman Fang, Sanyam Mehta, Pen-Chung Yew, Antonia Zhai, James B. S. G. Greensky, Gautham Beeraka, Binyu Zang:
Measuring Microarchitectural Details of Multi- and Many-Core Memory Systems through Microbenchmarking. 55:1-55:26
- Chi Ching Chi, Mauricio Alvarez-Mesa, Ben H. H. Juurlink:
Low-Power High-Efficiency Video Decoding using General-Purpose Processors. 56:1-56:25
- Fabio Luporini, Ana Lucia Varbanescu, Florian Rathgeber, Gheorghe-Teodor Bercea, J. Ramanujam, David A. Ham, Paul H. J. Kelly:
Cross-Loop Optimization of Arithmetic Intensity for Finite Element Local Assembly. 57:1-57:25
- Xing Zhou, María Jesús Garzarán, David A. Padua:
Optimal Parallelogram Selection for Hierarchical Tiling. 58:1-58:23
- Leo Porter, Michael A. Laurenzano, Ananta Tiwari, Adam Jundt, William A. Ward Jr., Roy L. Campbell, Laura Carrington:
Making the Most of SMT in HPC: System- and Application-Level Perspectives. 59:1-59:26
- Xin Tong, Toshihiko Koju, Motohiro Kawahito, Andreas Moshovos:
Optimizing Memory Translation Emulation in Full System Emulators. 60:1-60:24
- Martin Kong, Antoniu Pop, Louis-Noël Pouchet, R. Govindarajan, Albert Cohen, P. Sadayappan:
Compiler/Runtime Framework for Dynamic Dataflow Parallelization of Tiled Programs. 61:1-61:30
- Nicolas Melot, Christoph W. Keßler, Jörg Keller, Patrick Eitschberger:
Fast Crown Scheduling Heuristics for Energy-Efficient Mapping and Scaling of Moldable Streaming Tasks on Manycore Systems. 62:1-62:24
- Wenjia Ruan, Yujie Liu, Michael F. Spear:
Transactional Read-Modify-Write Without Aborts. 63:1-63:24
- Zia Ul Huda, Ali Jannesari, Felix Wolf:
Using Template Matching to Infer Parallel Design Patterns. 64:1-64:21
- Heiner Litz, Ricardo J. Dias, David R. Cheriton:
Efficient Correction of Anomalies in Snapshot Isolation Transactions. 65:1-65:24
- Helge Bahmann, Nico Reissmann, Magnus Jahre, Jan Christian Meyer:
Perfect Reconstructability of Control Flow from Demand Dependence Graphs. 66:1-66:25
- Venmugil Elango, Naser Sedaghati, Fabrice Rastello, Louis-Noël Pouchet, J. Ramanujam, Radu Teodorescu, P. Sadayappan:
On Using the Roofline Model with Lower Bounds on Data Movement. 67:1-67:23