SC 2011: Seattle, WA, USA
- Scott A. Lathrop, Jim Costa, William Kramer:
Conference on High Performance Computing Networking, Storage and Analysis, SC 2011, Seattle, WA, USA, November 12-18, 2011. ACM 2011, ISBN 978-1-4503-0771-0
ACM Gordon Bell finalist
- Yukihiro Hasegawa, Jun-ichi Iwata, Miwako Tsuji, Daisuke Takahashi, Atsushi Oshiyama, Kazuo Minami, Taisuke Boku, Fumiyoshi Shoji, Atsuya Uno, Motoyoshi Kurokawa, Hikaru Inoue, Ikuo Miyoshi, Mitsuo Yokokawa:
First-principles calculations of electron states of a silicon nanowire with 100,000 atoms on the K computer. 1:1-1:11
- Mathieu Luisier, Timothy B. Boykin, Gerhard Klimeck, Wolfgang Fichtner:
Atomistic nanoelectronic device engineering with sustained performances up to 1.44 PFlop/s. 2:1-2:11
- Takashi Shimokawabe, Takayuki Aoki, Tomohiro Takaki, Toshio Endo, Akinori Yamanaka, Naoya Maruyama, Akira Nukada, Satoshi Matsuoka:
Peta-scale phase-field simulation for dendritic solidification on the TSUBAME 2.0 supercomputer. 3:1-3:11
- Massimo Bernaschi, Mauro Bisson, Toshio Endo, Satoshi Matsuoka, Massimiliano Fatica, Simone Melchionna:
Petaflop biofluidics simulations on a two million-core system. 4:1-4:12
- Leopold Grinberg, Joseph A. Insley, Vitali A. Morozov, Michael E. Papka, George E. Karniadakis, Dmitry A. Fedosov, Kalyan Kumaran:
A new computational paradigm in multiscale simulations: application to brain blood flow. 5:1-5:5
Dense linear algebra
- Rajib Nath, Stanimire Tomov, Tingxing Dong, Jack J. Dongarra:
Optimizing symmetric dense matrix-vector multiplication on GPUs. 6:1-6:10
- Henricus Bouwmeester, Mathias Jacquelin, Julien Langou, Yves Robert:
Tiled QR factorization algorithms. 7:1-7:11
- Azzam Haidar, Hatem Ltaief, Jack J. Dongarra:
Parallel reduction to condensed forms for symmetric eigenvalue problems using aggregated fine-grained and memory-aware kernels. 8:1-8:11
Domain specific languages
- Zach DeVito, Niels Joubert, Francisco Palacios, Stephen Oakley, Montserrat Medina, Mike Barrientos, Erich Elsen, Frank Ham, Alex Aiken, Karthik Duraisamy, Eric Darve, Juan J. Alonso, Pat Hanrahan:
Liszt: a domain specific language for building portable mesh-based PDE solvers. 9:1-9:12
- Wesley Kendall, Jingyuan Wang, Melissa R. Allen, Tom Peterka, Jian Huang, David Erickson:
Simplified parallel domain traversal. 10:1-10:11
- Naoya Maruyama, Tatsuo Nomura, Kento Sato, Satoshi Matsuoka:
Physis: an implicitly parallel programming model for stencil computations on large-scale GPU-accelerated supercomputers. 11:1-11:12
GPU optimizations
- Michael Bauer, Henry Cook, Brucek Khailany:
CudaDMA: optimizing GPU memory bandwidth via warp specialization. 12:1-12:11
- Shuai Che, Jeremy W. Sheaffer, Kevin Skadron:
Dymaxion: optimizing memory access patterns for heterogeneous systems. 13:1-13:11
- Jiayuan Meng, Vitali A. Morozov, Kalyan Kumaran, Venkatram Vishwanath, Thomas D. Uram:
GROPHECY: GPU performance projection from CPU code skeletons. 14:1-14:11
Best paper finalists
- Robert Preissl, Nathan Wichmann, Bill Long, John Shalf, Stéphane Ethier, Alice E. Koniges:
Multithreaded global address space communication techniques for gyrokinetic fusion applications on ultra-scale platforms. 78:1-78:11
- John K. Salmon, Mark A. Moraes, Ron O. Dror, David E. Shaw:
Parallel random numbers: as easy as 1, 2, 3. 16:1-16:12
Coordinating I/O
- Huaiming Song, Yanlong Yin, Xian-He Sun, Rajeev Thakur, Samuel Lang:
Server-side I/O coordination for parallel file systems. 17:1-17:11
- Xuechen Zhang, Kei Davis, Song Jiang:
QoS support for end users of I/O-intensive applications using shared storage systems. 18:1-18:12
- Venkatram Vishwanath, Mark Hereld, Vitali A. Morozov, Michael E. Papka:
Topology-aware data movement and staging for I/O acceleration on Blue Gene/P supercomputing systems. 19:1-19:11
Power optimization
- Iñigo Goiri, Ryan Beauchea, Kien Le, Thu D. Nguyen, Md. Enamul Haque, Jordi Guitart, Jordi Torres, Ricardo Bianchini:
GreenSlot: scheduling energy consumption in green datacenters. 20:1-20:11
- Osman Sarood, Laxmikant V. Kalé:
A 'cool' load balancer for parallel applications. 21:1-21:11
- Kien Le, Ricardo Bianchini, Jingru Zhang, Yogesh Jaluria, Jiandong Meng, Thu D. Nguyen:
Reducing electricity cost through virtual machine placement in high performance computing clouds. 22:1-22:12
Applications
- Kamesh Madduri, Khaled Z. Ibrahim, Samuel Williams, Eun-Jin Im, Stéphane Ethier, John Shalf, Leonid Oliker:
Gyrokinetic toroidal simulations on leading multi- and manycore HPC systems. 23:1-23:12
- George Vahala, Min Soe, Bo Zhang, Jeffrey Yepez, Linda Vahala, Jonathan Carter, Sean Ziegeler:
Unitary qubit lattice simulations of multiscale phenomena in quantum turbulence. 24:1-24:11
- Kenneth Moreland, Wesley Kendall, Tom Peterka, Jian Huang:
An image compositing solution at scale. 25:1-25:10
Large scale systems
- Dong Chen, Noel Eisley, Philip Heidelberger, Robert M. Senger, Yutaka Sugawara, Sameer Kumar, Valentina Salapura, David L. Satterfield, Burkhard D. Steinmacher-Burow, Jeffrey J. Parker:
The IBM Blue Gene/Q interconnection network and message unit. 26:1-26:10
- Eitan Frachtenberg, Ali Heydari, Harry Li, Amir Michael, Jacob Na, Avery Nisbet, Pierluigi Sarti:
High-efficiency server design. 27:1-27:27
- Peter M. Kogge, Timothy J. Dysart:
Using the TOP500 to trace and project technology and architecture trends. 28:1-28:11
Querying large scale data
- Kalin Kanov, Eric A. Perlman, Randal C. Burns, Yanif Ahmad, Alexander S. Szalay:
I/O streaming evaluation of batch queries for data-intensive computational turbulence. 29:1-29:10
- Jerry Chi-Yuan Chou, Mark Howison, Brian Austin, Kesheng Wu, Ji Qiang, E. Wes Bethel, Arie Shoshani, Oliver Rübel, Prabhat, Robert D. Ryne:
Parallel index and query for large scale data analysis. 30:1-30:11
- Sriram Lakshminarasimhan, John Jenkins, Isha Arkatkar, Zhenhuan Gong, Hemanth Kolla, Seung-Hoe Ku, Stéphane Ethier, Jackie Chen, Choong-Seock Chang, Scott Klasky, Robert Latham, Robert B. Ross, Nagiza F. Samatova:
ISABELA-QA: query-driven analytics with ISABELA-compressed extreme-scale scientific data. 31:1-31:11
Checkpointing optimization
- Leonardo Arturo Bautista-Gomez, Seiji Tsuboi, Dimitri Komatitsch, Franck Cappello, Naoya Maruyama, Satoshi Matsuoka:
FTI: high performance fault tolerance interface for hybrid systems. 32:1-32:32
- Marin Bougeret, Henri Casanova, Mikaël Rabie, Yves Robert, Frédéric Vivien:
Checkpointing strategies for parallel jobs. 33:1-33:11
- Bogdan Nicolae, Franck Cappello:
BlobCR: efficient checkpoint-restart for HPC applications on IaaS clouds using virtual disk image snapshots. 34:1-34:12
GPU applications
- Guangming Tan, Linchuan Li, Sean Triechle, Everett H. Phillips, Yungang Bao, Ninghui Sun:
Fast implementation of DGEMM on Fermi GPU. 35:1-35:11
- Qi Hu, Nail A. Gumerov, Ramani Duraiswami:
Scalable fast multipole methods on distributed heterogeneous architectures. 36:1-36:12
- Hemant Shukla, Hsi-Yu Schive, Tak-Pong Woo, Tzihong Chiueh:
Multi-science applications with single codebase - GAMER - for massively parallel architectures. 37:1-37:11
Storage and memory
- Michael R. Frasca, Ramya Prabhakar, Padma Raghavan, Mahmut T. Kandemir:
Virtual I/O caching: dynamic storage cache management for concurrent workloads. 38:1-38:11
- XiaoJian Wu, A. L. Narasimha Reddy:
SCMFS: a file system for storage class memory. 39:1-39:11
- Khaled Z. Ibrahim, Steven A. Hofmeyr, Costin Iancu, Eric Roman:
Optimized pre-copy live migration for memory intensive applications. 40:1-40:11
Performance evaluation and analysis
- Eric L. Goodman, M. Nicole Lemaster, Edward Jimenez:
Scalable hashing for shared memory supercomputers. 41:1-41:11
- Kevin J. Barker, Adolfy Hoisie, Darren J. Kerbyson:
An early performance analysis of POWER7-IH HPC systems. 42:1-42:11
- Mario Lassnig, Thomas Fahringer, Vincent Garonne, Angelos Molfetas, Martin Barisits:
A similarity measure for time, frequency, and dependencies in large-scale workloads. 43:1-43:11
Reliability
- Kurt B. Ferreira, Jon Stearley, James H. Laros III, Ron A. Oldfield, Kevin T. Pedretti, Ron Brightwell, Rolf Riesen, Patrick G. Bridges, Dorian C. Arnold:
Evaluating the viability of process replication reliability for exascale systems. 44:1-44:12
- Eric Martin Heien, Derrick Kondo, Ana Gainaru, Dan Lapine, Bill Kramer, Franck Cappello:
Modeling and tolerating heterogeneous failures in large parallel systems. 45:1-45:11
- Sheng Li, Ke Chen, Ming-yu Hsieh, Naveen Muralimanohar, Chad D. Kersey, Jay B. Brockman, Arun F. Rodrigues, Norman P. Jouppi:
System implications of memory reliability in exascale computing. 46:1-46:12
Scheduling and resource allocation
- Ron Chi-Lung Chiang, H. Howie Huang:
TRACON: interference-aware scheduling for data-intensive applications in virtualized environments. 47:1-47:12
- Thomas J. Hacker, Kanak Mahadik:
Flexible resource allocation for reliable virtual cluster computing systems. 48:1-48:12
- Ming Mao, Marty Humphrey:
Auto-scaling to minimize cost and meet application deadlines in cloud workflows. 49:1-49:12
Debugging
- Ignacio Laguna, Todd Gamblin, Bronis R. de Supinski, Saurabh Bagchi, Greg Bronevetsky, Dong H. Ahn, Martin Schulz, Barry Rountree:
Large scale debugging of parallel tasks with AutomaDeD. 50:1-50:10
- Chang-Seo Park, Koushik Sen, Paul Hargrove, Costin Iancu:
Efficient data race detection for distributed memory parallel programs. 51:1-51:12
Multicore architectural tools
- Trevor E. Carlson, Wim Heirman, Lieven Eeckhout:
Sniper: exploring the level of abstraction for scalable and accurate parallel multi-core simulation. 52:1-52:12
- Karthik Ganesan, Lizy K. John:
MAximum Multicore POwer (MAMPO): an automatic multithreaded synthetic power virus generation framework for multicore systems. 53:1-53:12
Application performance
- Patrick H. Worley, Arthur A. Mirin, Anthony P. Craig, Mark A. Taylor, John M. Dennis, Mariana Vertenstein:
Performance of the community earth system model. 54:1-54:11
- Samuel Williams, Leonid Oliker, Jonathan Carter, John Shalf:
Extracting ultra-scale Lattice Boltzmann performance via hierarchical and distributed auto-tuning. 55:1-55:12
- Benoit Marchand, Vladimir B. Bajic, Dinesh K. Kaushik:
Highly scalable ab initio genomic motif identification. 56:1-56:10
MapReduce
- Yandong Wang, Xinyu Que, Weikuan Yu, Dror Goldenberg, Dhiraj Sehgal:
Hadoop acceleration through network levitated merge. 57:1-57:10
- Balaji Palanisamy, Aameek Singh, Ling Liu, Bhushan Jain:
Purlieus: locality-aware resource allocation for MapReduce in a cloud. 58:1-58:11
- Atilla Soner Balkir, Ian T. Foster, Andrey Rzhetsky:
A distributed look-up architecture for text mining applications using MapReduce. 59:1-59:11
Molecular dynamics and computational physics
- Sander Pronk, Per Larsson, Iman Pouya, Gregory R. Bowman, Imran S. Haque, Kyle Beauchamp, Berk Hess, Vijay S. Pande, Peter M. Kasson, Erik Lindahl:
Copernicus: a new paradigm for parallel adaptive molecular dynamics. 60:1-60:10
- Chao Mei, Yanhua Sun, Gengbin Zheng, Eric J. Bohm, Laxmikant V. Kalé, James C. Phillips, Chris Harrison:
Enabling and scaling biomolecular simulations of 100 million atoms on petascale machines with a multicore-optimized message-driven runtime. 61:1-61:11
- Susumu Yamada, Toshiyuki Imamura, Masahiko Machida:
Parallelization design on multi-core platforms in density matrix renormalization group toward 2-D quantum strongly-correlated systems. 62:1-62:10
Applications
- Andy Yoo, Allison H. Baker, Roger A. Pearce, Van Emden Henson:
A scalable eigensolver for large scale-free graphs using 2D graph partitioning. 63:1-63:11
- Miles Lubin, Cosmin G. Petra, Mihai Anitescu, Victor M. Zavala:
Scalable stochastic optimization of complex energy systems. 64:1-64:64
- Aydin Buluç, Kamesh Madduri:
Parallel breadth-first search on distributed memory systems. 65:1-65:12
MapReduce and network QoS
- Joe B. Buck, Noah Watkins, Jeff LeFevre, Kleoni Ioannidou, Carlos Maltzahn, Neoklis Polyzotis, Scott A. Brandt:
SciHadoop: array-based query processing in Hadoop. 66:1-66:11
- Wittawat Tantisiriroj, Seung Woo Son, Swapnil Patil, Samuel Lang, Garth Gibson, Robert B. Ross:
On the duality of data-intensive file system design: reconciling HDFS and PVFS. 67:1-67:12
- Sushant Sharma, Dimitrios Katramatos, Dantong Yu:
End-to-end network QoS via scheduling of flexible resource reservation requests. 68:1-68:10
QCD and DFT
- Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Jee W. Choi, Bálint Joó, Jatin Chhugani, Michael A. Clark, Pradeep Dubey:
High-performance lattice QCD for multi-core based parallel systems using a cache-friendly hybrid threaded-MPI approach. 69:1-69:11
- Ronald Babich, Michael A. Clark, Bálint Joó, Guochun Shi, Richard C. Brower, Steven A. Gottlieb:
Scaling lattice QCD beyond 100 GPUs. 70:1-70:11
- Long Wang, Yue Wu, Weile Jia, Weiguo Gao, Xuebin Chi, Lin-Wang Wang:
Large scale plane wave pseudopotential density functional theory calculations on GPU clusters. 71:1-71:10
Applications
- Karol Kowalski, Sriram Krishnamoorthy, Ryan M. Olson, Vinod Tipparaju, Edoardo Aprà:
Scalable implementations of accurate excited-state coupled cluster theories: application of high-level methods to porphyrin-based systems. 72:1-72:10
- Jens Krueger, David Donofrio, John Shalf, Marghoob Mohiyuddin, Samuel Williams, Leonid Oliker, Franz-Josef Pfreundt:
Hardware/software co-design for energy-efficient seismic modeling. 73:1-73:12
- Gerhard Niederbrucker, Wilfried N. Gansterer:
A fast solver for modeling the evolution of virus populations. 74:1-74:11
Optimizing communication performance
- Junchao Zhang, Babak Behzad, Marc Snir:
Optimizing the Barnes-Hut algorithm in UPC. 75:1-75:11
- Abhinav Bhatele, Nikhil Jain, William D. Gropp, Laxmikant V. Kalé:
Avoiding hot-spots on two-level direct networks. 76:1-76:11
- Edgar Solomonik, Abhinav Bhatele, James Demmel:
Improving communication performance in dense linear algebra via topology aware collectives. 77:1-77:11