22nd ICS 2008: Island of Kos, Greece
- Pin Zhou:
Proceedings of the 22nd Annual International Conference on Supercomputing, ICS 2008, Island of Kos, Greece, June 7-12, 2008. ACM 2008, ISBN 978-1-60558-158-3
- Mark J. Harris:
Many-core GPU computing with NVIDIA CUDA. 1
- Tilak Agerwala:
Challenges on the road to exascale computing. 2
- David E. Keyes:
Petaflop/s, seriously. 3
Algorithms & applications 1
- Khaled Z. Ibrahim, François Bodin:
Implementing Wilson-Dirac operator on the Cell Broadband Engine. 4-14
- Timothy D. R. Hartley, Ümit V. Çatalyürek, Antonio Ruiz, Francisco D. Igual, Rafael Mayo, Manuel Ujaldon:
Biomedical image analysis on a cooperative cluster of GPUs and multicores. 15-25
- Gregory Buehrer, Srinivasan Parthasarathy, Matthew Goyder:
Data mining on the Cell Broadband Engine. 26-35
Performance evaluation 1
- Jonathan Weinberg, Allan Snavely:
Accurate memory signatures and synthetic address traces for HPC applications. 36-45
- Prasun Ratn, Frank Mueller, Bronis R. de Supinski, Martin Schulz:
Preserving time in large-scale communication traces. 46-55
Architecture 1
- Michel N. Victor, Aris K. Silzars, Edward S. Davidson:
A freespace crossbar for multi-core processors. 56-62
- Song Liu, Seda Ogrenci Memik, Yu Zhang, Gokhan Memik:
An approach for adaptive DRAM temperature and power management. 63-72
- Jeffery A. Brown, Dean M. Tullsen:
The shared-thread multiprocessor. 73-82
Communication & synchronization 1
- Qasim Ali, Vijay S. Pai, Samuel P. Midkiff:
Advanced collective communication in Aspen. 83-93
- Sameer Kumar, Gábor Dózsa, Gheorghe Almási, Philip Heidelberger, Dong Chen, Mark Giampapa, Michael Blocksome, Ahmad Faraj, Jeff Parker, Joe Ratterman, Brian E. Smith, Charles Archer:
The deep computing messaging framework: generalized scalable message passing on the Blue Gene/P supercomputer. 94-103
- Brian S. White, Sally A. McKee, Daniel J. Quinlan:
A projection-based optimization framework for abstractions with application to the unstructured mesh domain. 104-113
File systems
- Ruini Xue, Wenguang Chen, Weimin Zheng:
CprFS: a user-level file system to support consistent file states for checkpoint and restart. 114-123
- Henry M. Monti, Ali Raza Butt, Sudharshan S. Vazhkudai:
Timely offloading of result-data in HPC centers. 124-133
- Huijun Zhu, Peng Gu, Jun Wang:
Shifted declustering: a placement-ideal layout scheme for multi-way replication storage architecture. 134-144
Fault tolerance
- Matthew J. Koop, Rahul Kumar, Dhabaleswar K. Panda:
Can software reliability outperform hardware reliability on high performance interconnects?: a case study with MPI over InfiniBand. 145-154
- Greg Bronevetsky, Bronis R. de Supinski:
Soft error vulnerability of iterative linear algebra methods. 155-164
Operating systems
- Edi Shmueli, George Almási, José R. Brunheroto, José G. Castaños, Gábor Dózsa, Sameer Kumar, Derek Lieber:
Evaluating the effect of replacing CNK with Linux on the compute-nodes of Blue Gene/L. 165-174
- Akshat Verma, Puneet Ahuja, Anindya Neogi:
Power-aware dynamic placement of HPC applications. 175-184
- Hyung Won Choi, Hukeun Kwak, Andrew Sohn, Kyusik Chung:
Autonomous learning for efficient resource utilization of dynamic VM migration. 185-194
Algorithms & applications 2
- Seyong Lee, Rudolf Eigenmann:
Adaptive runtime tuning of parallel sparse matrix-vector multiplication on distributed memory systems. 195-204
- Yuri Dotsenko, Naga K. Govindaraju, Peter-Pike J. Sloan, Charles Boyd, John Manferdelli:
Fast scan algorithms on graphics processors. 205-213
- Andrey N. Chernikov, Nikos Chrisochoides:
Three-dimensional Delaunay refinement for multi-core processors. 214-224
Code performance tuning
- Muthu Manikandan Baskaran, Uday Bondhugula, Sriram Krishnamoorthy, J. Ramanujam, Atanas Rountev, P. Sadayappan:
A compiler framework for optimization of affine loop nests for GPGPUs. 225-234
- Suhyun Kim, Soo-Mook Moon:
Rotating register allocation with multiple rotating branches. 235-244
- Yixin Shou, Robert A. van Engelen:
Automatic SIMD vectorization of chains of recurrences. 245-255
Communication & synchronization 2
- Seung-Jai Min, Rudolf Eigenmann:
Optimizing irregular shared-memory applications for clusters. 256-265
- Costin Iancu, Wei Chen, Katherine A. Yelick:
Performance portable optimizations for loops containing communication operations. 266-276
- Jun Shirako, David M. Peixotto, Vivek Sarkar, William N. Scherer III:
Phasers: a unified deadlock-free construct for collective and point-to-point synchronization. 277-288
Memory management
- Tong Chen, Haibo Lin, Tao Zhang:
Orchestrating data transfer for the Cell/B.E. processor. 289-298
- Isaac Gelado, John H. Kelm, Shane Ryoo, Steven S. Lumetta, Nacho Navarro, Wen-mei W. Hwu:
CUBA: an architecture for efficient CPU/co-processor data communication. 299-308
- Mark Silberstein, Assaf Schuster, Dan Geiger, Anjul Patney, John D. Owens:
Efficient computation of sum-products on GPUs through software-managed cache. 309-318
Architecture 2
- Fang Lu, Lei Wang, Xiaobing Feng, Zhiyuan Li, Zhaoqing Zhang:
Exploiting idle register classes for fast spill destination. 319-326
- William Lloyd Bircher, Lizy K. John:
Analysis of dynamic power management on multi-core processors. 327-338
- R. Manikantan, R. Govindarajan:
Focused prefetching: performance oriented prefetching based on commit stalls. 339-348
Performance evaluation 2
- Marc Casas, Rosa M. Badia, Jesús Labarta:
Automatic analysis of speedup of MPI applications. 349-358
- Lixia Liu, Zhiyuan Li, Ahmed H. Sameh:
Analyzing memory access intensity in parallel programs on multicore. 359-367
- Bradley J. Barnes, Barry Rountree, David K. Lowenthal, Jaxk Reeves, Bronis R. de Supinski, Martin Schulz:
A regression-based approach to scalability prediction. 368-377