SC 2012: Salt Lake City, UT, USA
- Jeffrey K. Hollingsworth:
SC Conference on High Performance Computing Networking, Storage and Analysis, SC '12, Salt Lake City, UT, USA - November 11 - 15, 2012. IEEE/ACM 2012, ISBN 978-1-4673-0804-5
ACM Gordon Bell finalist: ACM Gordon Bell prize I
- Jatin Chhugani, Changkyu Kim, Hemant Shukla, Jongsoo Park, Pradeep Dubey, John Shalf, Horst D. Simon:
Billion-particle SIMD-friendly two-point correlation on large-scale HPC cluster systems. 1
- Arthur A. Mirin, David F. Richards, James N. Glosli, Erik W. Draeger, Bor Chan, Jean-Luc Fattebert, William D. Krauss, Tomas Oppelstrup, John Jeremy Rice, John A. Gunnels, Viatcheslav Gurev, Changhoan Kim, John Magerlein, Matthias Reumann, Hui-Fang Wen:
Toward real-time modeling of human heart ventricles at cellular resolution: simulation of drug-induced arrhythmias. 2
- Tan Bui-Thanh, Carsten Burstedde, Omar Ghattas, James Martin, Georg Stadler, Lucas C. Wilcox:
Extreme-scale UQ for Bayesian inverse problems governed by PDEs. 3
ACM Gordon Bell finalist: ACM Gordon Bell prize II
- Salman Habib, Vitali A. Morozov, Hal Finkel, Adrian Pope, Katrin Heitmann, Kalyan Kumaran, Tom Peterka, Joseph A. Insley, David Daniel, Patricia K. Fasel, Nicholas Frontiere, Zarija Lukic:
The universe at extreme scale: multi-petaflop sky simulation on the BG/Q. 4
- Tomoaki Ishiyama, Keigo Nitadori, Junichiro Makino:
4.45 Pflops astrophysical N-body simulation on K computer: the gravitational trillion-body problem. 5
Analysis of I/O and storage
- Robert Henschel, Stephen C. Simms, David Y. Hancock, Scott Michael, Tom Johnson, Nathan Heald, Thomas William, Donald K. Berry, Matthew Allen, Richard Knepper, Matt Davy, Matthew R. Link, Craig A. Stewart:
Demonstrating lustre over a 100Gbps wide area network of 3,500km. 6
- Dirk Meister, Jürgen Kaiser, André Brinkmann, Toni Cortes, Michael Kuhn, Julian M. Kunkel:
A study on data deduplication in HPC storage systems. 7
- Bing Xie, Jeffrey S. Chase, David Dillow, Oleg Drokin, Scott Klasky, Sarp Oral, Norbert Podhorszki:
Characterizing output bottlenecks in a supercomputer. 8
Autotuning and search-based optimization
- Dheya Mustafa, Rudolf Eigenmann:
Portable section-level tuning of compiler parallelized applications. 9
- Herbert Jordan, Peter Thoman, Juan Jose Durillo Barrionuevo, Simone Pellegrini, Philipp Gschwandtner, Thomas Fahringer, Hans Moritsch:
A multi-objective auto-tuning framework for parallel codes. 10
- Matthias Christen, Olaf Schenk, Yifeng Cui:
Patus for convenient high-performance stencils: evaluation in earthquake simulations. 11
Breadth first search
- Scott Beamer, Krste Asanovic, David A. Patterson:
Direction-optimizing breadth-first search. 12
- Fabio Checconi, Fabrizio Petrini, Jeremiah Willcock, Andrew Lumsdaine, Anamitra R. Choudhury, Yogish Sabharwal:
Breaking the speed and scalability barriers for graph exploration on distributed-memory machines. 13
- Nadathur Satish, Changkyu Kim, Jatin Chhugani, Pradeep Dubey:
Large-scale energy-efficient graph traversal: a path to efficient data-intensive supercomputing. 14
Direct numerical simulations
- John M. Levesque, Ramanan Sankaran, Ray W. Grout:
Hybridizing S3D into an exascale application using OpenACC: an approach for moving to multi-petaflops and beyond. 15
- Babak Hejazialhosseini, Diego Rossinelli, Christian Conti, Petros Koumoutsakos:
High throughput software for direct numerical simulations of compressible two-phase flows. 16
Checkpointing
- Tanzima Zerin Islam, Kathryn M. Mohror, Saurabh Bagchi, Adam Moody, Bronis R. de Supinski, Rudolf Eigenmann:
McrEngine: a scalable checkpointing system using data-aware aggregation and compression. 17
- Rolf Riesen, Kurt B. Ferreira, Dilma Da Silva, Pierre Lemarinier, Dorian C. Arnold, Patrick G. Bridges:
Alleviating scalability issues of checkpointing protocols. 18
- Kento Sato, Naoya Maruyama, Kathryn M. Mohror, Adam Moody, Todd Gamblin, Bronis R. de Supinski, Satoshi Matsuoka:
Design and modeling of a non-blocking checkpointing system. 19
Cloud computing
- Thanasis G. Papaioannou, Nicolas Bonvin, Karl Aberer:
Scalia: an adaptive scheme for efficient multi-cloud storage. 20
- Sheng Di, Derrick Kondo, Walfredo Cirne:
Host load prediction in a Google compute cloud with a Bayesian model. 21
- Maciej Malawski, Gideon Juve, Ewa Deelman, Jarek Nabrzyski:
Cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds. 22
GPU programming models and patterns
- Seyong Lee, Jeffrey S. Vetter:
Early evaluation of directive-based GPU programming models for productive exascale computing. 23
- Jacques A. Pienaar, Srimat T. Chakradhar, Anand Raghunathan:
Automatic generation of software pipelines for heterogeneous parallel systems. 24
- Linchuan Chen, Xin Huo, Gagan Agrawal:
Accelerating MapReduce on a coupled CPU-GPU architecture. 25
Maximizing performance on multi-core and many-core architectures
- Francisco D. Igual, Murtaza Ali, Arnon Friedmann, Eric Stotzer, Timothy Wentz, Robert A. van de Geijn:
Unleashing the high-performance and low-power of multi-core DSPs for general-purpose HPC. 26
- Li-Wen Chang, John A. Stratton, Hee-Seok Kim, Wen-mei W. Hwu:
A scalable, numerically stable, high-performance tridiagonal solver using GPUs. 27
- Jongsoo Park, Ping Tak Peter Tang, Mikhail Smelyanskiy, Daehyun Kim, Thomas Benson:
Efficient backprojection-based synthetic aperture radar computation with many-core processors. 28
Auto-diagnosis of correctness and performance issues
- Peng Li, Guodong Li, Ganesh Gopalakrishnan:
Parametric flows: automated behavior equivalencing for symbolic analysis of races in CUDA programs. 29
- Tobias Hilbrich, Joachim Protze, Martin Schulz, Bronis R. de Supinski, Matthias S. Müller:
MPI runtime error detection with MUST: advances in deadlock detection. 30
- Abhinav Bhatele, Todd Gamblin, Katherine E. Isaacs, Brian T. N. Gunney, Martin Schulz, Peer-Timo Bremer, Bernd Hamann:
Novel views of performance data to analyze large-scale adaptive applications. 31
DRAM power and resiliency management
- Donghong Wu, Bingsheng He, Xueyan Tang, Jianliang Xu, Minyi Guo:
RAMZzz: rank-aware dram power management with dynamic migrations and demotions. 32
- Sheng Li, Doe Hyun Yoon, Ke Chen, Jishen Zhao, Jung Ho Ahn, Jay B. Brockman, Yuan Xie, Norman P. Jouppi:
MAGE: adaptive granularity and ECC for resilient and power efficient memory systems. 33
Grids/clouds networking
- Yufei Ren, Tan Li, Dantong Yu, Shudong Jin, Thomas G. Robertazzi, Brian Tierney, Eric Pouyoul:
Protocols for wide-area data-intensive applications: design and performance issues. 34
- Nusrat S. Islam, Md. Wasi-ur-Rahman, Jithin Jose, Raghunath Rajachandrasekar, Hao Wang, Hari Subramoni, Chet Murthy, Dhabaleswar K. Panda:
High performance RDMA-based design of HDFS over InfiniBand. 35
- Kiril Dichev, Fergal Reid, Alexey L. Lastovetsky:
Efficient and reliable network tomography in heterogeneous networks using BitTorrent broadcasts and clustering algorithms. 36
Weather and seismic simulations
- Preeti Malakar, Thomas George, Sameer Kumar, Rashmi Mittal, Vijay Natarajan, Yogish Sabharwal, Vaibhav Saxena, Sathish S. Vadhiyar:
A divide and conquer strategy for scaling weather simulations with multiple regions of interest. 37
- Max Rietmann, Peter Messmer, Tarje Nissen-Meyer, Daniel Peter, Piero Basini, Dimitri Komatitsch, Olaf Schenk, Jeroen Tromp, Lapo Boschi, Domenico Giardini:
Forward and adjoint simulations of seismic wave propagation on emerging large-scale GPU architectures. 38
Compiler-based analysis and optimization
- Tan Nguyen, Pietro Cicotti, Eric J. Bylaska, Dan Quinlan, Scott B. Baden:
Bamboo: translating MPI applications to a latency-tolerant, data-driven form. 39
- Vinayaka Bandishti, Irshad Pananilath, Uday Bondhugula:
Tiling stencil computations to maximize parallelism. 40
- Wei Ding, Yuanrui Zhang, Mahmut T. Kandemir, Seung Woo Son:
Compiler-directed file layout optimization for hierarchical storage systems. 41
Fast algorithms
- Ping Tak Peter Tang, Jongsoo Park, Daehyun Kim, Vladimir Petrov:
A framework for low-communication 1-D FFT. 42
- Hari Sundar, George Biros, Carsten Burstedde, Johann Rudi, Omar Ghattas, Georg Stadler:
Parallel geometric-algebraic multigrid on unstructured forests of octrees. 43
- Akira Nukada, Kento Sato, Satoshi Matsuoka:
Scalable multi-GPU 3-D FFT for TSUBAME 2.0 supercomputer. 44
Massively parallel simulations
- Jun Doi:
Peta-scale lattice quantum chromodynamics on a blue gene/Q supercomputer. 45
- Abhinav Sarje, Xiaoye S. Li, Slim Chourou, Elaine R. Chan, Alexander Hexemer:
Massively parallel X-ray scattering simulations. 46
- Christopher Baker, Gregory G. Davidson, Thomas M. Evans, Steven P. Hamilton, Joshua J. Jarrell, Wayne Joubert:
High performance radiation transport simulations: preparing for Titan. 47
Optimizing I/O for analytics
- John Jenkins, Eric R. Schendel, Sriram Lakshminarasimhan, David A. Boyuka II, Terry Rogers, Stéphane Ethier, Robert B. Ross, Scott Klasky, Nagiza F. Samatova:
Byte-precision level of detail processing for variable precision analytics. 48
- Janine Bennett, Hasan Abbasi, Peer-Timo Bremer, Ray W. Grout, Attila Gyulassy, Tong Jin, Scott Klasky, Hemanth Kolla, Manish Parashar, Valerio Pascucci, Philippe P. Pébay, David C. Thompson, Hongfeng Yu, Fan Zhang, Jacqueline Chen:
Combining in-situ and in-transit processing to enable extreme-scale scientific analysis. 49
- Sidharth Kumar, Venkatram Vishwanath, Philip H. Carns, Joshua A. Levine, Robert Latham, Giorgio Scorzelli, Hemanth Kolla, Ray W. Grout, Robert B. Ross, Michael E. Papka, Jacqueline Chen, Valerio Pascucci:
Efficient data restructuring and aggregation for I/O acceleration in PIDX. 50
Datacenter technologies
- Melanie Kambadur, Tipp Moseley, Rick Hank, Martha A. Kim:
Measuring interference between live datacenter applications. 51
- Rini T. Kaushik, Klara Nahrstedt:
T*: a data-centric cooling energy costs reduction approach for big data analytics cloud. 52
- Vignesh T. Ravi, Michela Becchi, Gagan Agrawal, Srimat T. Chakradhar:
ValuePack: value-based scheduling framework for CPU-GPU clusters. 53
Optimizing application performance
- Robert Preissl, Theodore M. Wong, Pallab Datta, Myron Flickner, Raghavendra Singh, Steven K. Esser, William P. Risk, Horst D. Simon, Dharmendra S. Modha:
Compass: a scalable simulator for an architecture for cognitive computing. 54
- Yanhua Sun, Gengbin Zheng, Chao Mei, Eric J. Bohm, James C. Phillips, Laxmikant V. Kalé, Terry R. Jones:
Optimizing fine-grained communication in a biomolecular simulation application on Cray XK6. 55
- Yuri Alexeev, Ashutosh Mahajan, Sven Leyffer, Graham Fletcher, Dmitri G. Fedorov:
Heuristic static load-balancing algorithm applied to the fragment molecular orbital method. 56
Resilience
- Dong Li, Jeffrey S. Vetter, Weikuan Yu:
Classifying soft error vulnerabilities in extreme-scale scientific applications using a binary instrumentation tool. 57
- Jinsuk Chung, Ikhwan Lee, Michael B. Sullivan, Jee Ho Ryoo, Dong-Wan Kim, Doe Hyun Yoon, Larry Kaplan, Mattan Erez:
Containment domains: a scalable, efficient, and flexible resilience scheme for exascale systems. 58
Visualization and analysis of massive data sets
- Surendra Byna, Jerry Chi-Yuan Chou, Oliver Rübel, Prabhat, Homa Karimabadi, William S. Daughton, Vadim Roytershteyn, E. Wes Bethel, Mark Howison, Ke-Jou Hsu, Kuan-Wu Lin, Arie Shoshani, Andrew Uselton, Kesheng Wu:
Parallel I/O, analysis, and visualization of a trillion particle simulation. 59
- Kalin Kanov, Randal C. Burns, Gregory L. Eyink, Charles Meneveau, Alexander S. Szalay:
Data-intensive spatial filtering in large numerical simulation datasets. 60
- Boonthanome Nouanesengsy, Teng-Yok Lee, Kewei Lu, Han-Wei Shen, Tom Peterka:
Parallel particle advection and FTLE computation for time-varying flow fields. 61
Graph algorithms
- Md. Mostofa Ali Patwary, Diana Palsetia, Ankit Agrawal, Wei-keng Liao, Fredrik Manne, Alok N. Choudhary:
A new scalable parallel DBSCAN algorithm using the disjoint-set data structure. 62
- Olga Nikolova, Srinivas Aluru:
Parallel Bayesian network structure learning with application to gene networks. 63
- Arif M. Khan, David F. Gleich, Alex Pothen, Mahantesh Halappanavar:
A multithreaded algorithm for network alignment via approximate matching. 64
Locality in programming models and runtimes
- Stephen Olivier, Bronis R. de Supinski, Martin Schulz, Jan F. Prins:
Characterizing and mitigating work time inflation in task parallel programs. 65
- Michael Bauer, Sean Treichler, Elliott Slaughter, Alex Aiken:
Legion: expressing locality and independence with logical regions. 66
- Michael Garland, Manjunath Kudlur, Yili Zheng:
Designing a unified programming model for heterogeneous machines. 67
Networks
- Sushant Sharma, Dimitrios Katramatos, Dantong Yu, Li Shi:
Design and implementation of an intelligent end-to-end network QoS system. 68
- Dong Chen, Noel Eisley, Philip Heidelberger, Sameer Kumar, Amith R. Mamidala, Fabrizio Petrini, Robert M. Senger, Yutaka Sugawara, Robert Walkup, Burkhard D. Steinmacher-Burow, Anamitra R. Choudhury, Yogish Sabharwal, Swati Singhal, Jeffrey J. Parker:
Looking under the hood of the IBM blue gene/Q network. 69
- Hari Subramoni, Sreeram Potluri, Krishna Chaitanya Kandalla, Bill Barth, Jérôme Vienne, Jeff Keasler, Karen A. Tomko, Karl W. Schulz, Adam Moody, Dhabaleswar K. Panda:
Design of a scalable InfiniBand topology service to enable network-topology-aware placement of processes. 70
Runtime-based analysis and optimization
- Guancheng Chen, Per Stenström:
Critical lock analysis: diagnosing critical section bottlenecks in multithreaded applications. 71
- Mahesh Ravishankar, John Eisenlohr, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, P. Sadayappan:
Code generation for parallel execution of a class of irregular loops on distributed memory systems. 72
Cosmology applications
- Jean-Michel Alimi, Vincent Bouillot, Yann Rasera, Vincent Reverdy, Pier-Stefano Corasaniti, Irène Balmès, Stéphane Requena, Xavier Delaruelle, Jean-Noel Richet:
First-ever full observable universe simulation. 73
- William B. March, Kenneth Czechowski, Marat Dukhan, Thomas Benson, Dongryeol Lee, Andrew J. Connolly, Richard W. Vuduc, Edmond Chow, Alexander G. Gray:
Optimizing the computation of n-point correlations on large-scale astronomical data. 74
- Jingjin Wu, Zhiling Lan, Xuanxing Xiong, Nickolay Y. Gnedin, Andrey V. Kravtsov:
Hierarchical task mapping of cell-based AMR cosmology simulations. 75
Fault detection and analysis
- Vilas Sridharan, Dean Liberty:
A study of DRAM failures in the field. 76
- Ana Gainaru, Franck Cappello, Marc Snir, William Kramer:
Fault prediction under the microscope: a closer look into HPC systems. 77
- David Fiala, Frank Mueller, Christian Engelmann, Rolf Riesen, Kurt B. Ferreira, Ron Brightwell:
Detection and correction of silent data corruption for large-scale high-performance computing. 78
Grid computing
- Dmytro Karpenko, Roman Vitenberg, Alexander L. Read:
ATLAS grid workload on NDGF resources: analysis, modeling, and workload generation. 79
- Trilce Estrada, Michela Taufer:
On the effectiveness of application-aware self-management for scientific discovery in volunteer computing systems. 80
- Zhengyang Liu, Malathi Veeraraghavan, Zhenzhen Yan, Chris Tracy, Jing Tie, Ian T. Foster, John M. Dennis, Jason Hick, Yee-Ting Li, W. Yang:
On using virtual circuits for GridFTP transfers. 81
Performance modeling
- Jiayuan Meng, Vitali A. Morozov, Venkatram Vishwanath, Kalyan Kumaran:
Dataflow-driven GPU performance projection for multi-kernel transformations. 82
- Tyler Dwyer, Alexandra Fedorova, Sergey Blagodurov, Mark Roth, Fabien Gaud, Jian Pei:
A practical method for estimating performance degradation on multicore processors, and its application to HPC workloads. 83
- Kyle Spafford, Jeffrey S. Vetter:
Aspen: a domain specific language for performance modeling. 84
Big data
- Zhao Zhang, Daniel S. Katz, Justin M. Wozniak, Allan Espinosa, Ian T. Foster:
Design and analysis of data management in scalable parallel scripting. 85
- Ian F. Adams, Brian A. Madden, Joel Cameron Frank, Mark W. Storer, Ethan L. Miller, Gene Harano:
Usage behavior of a large-scale scientific archive. 86
- Jharrod Lafon, Satyajayant Misra, Jon Bringhurst:
On distributed file tree walk of parallel file systems. 87
Memory systems
- I-Hsin Chung, Changhoan Kim, Hui-Fang Wen, Guojing Cong:
Application data prefetching on the IBM blue gene/Q supercomputer. 88
- Lluc Alvarez, Lluís Vilanova, Marc González, Xavier Martorell, Nacho Navarro, Eduard Ayguadé:
Hardware-software coherence protocol for the coexistence of caches and local memories. 89
- Martin Schindewolf, Barna L. Bihari, John C. Gyllenhaal, Martin Schulz, Amy Wang, Wolfgang Karl:
What scientific applications can benefit from hardware transactional memory? 90
Numerical algorithms
- Laura Grigori, Radek Stompor, Mikolaj Szydlarski:
A parallel two-level preconditioner for cosmic microwave background map-making. 91
- Robert Speck, Daniel Ruprecht, Rolf Krause, Matthew Emmett, Michael L. Minion, Mathias Winkel, Paul Gibbon:
A massively space-time parallel N-body solver. 92
- Katsuki Fujisawa, Hitoshi Sato, Satoshi Matsuoka, Toshio Endo, Makoto Yamashita, Maho Nakata:
High-performance general solver for extremely large-scale semidefinite programming problems. 93
Performance optimization
- Rob F. Van der Wijngaart, Srinivas Sridharan, Victor W. Lee:
Extending the BT NAS parallel benchmark to exascale computing. 94
- Michael R. Frasca, Kamesh Madduri, Padma Raghavan:
NUMA-aware graph mining techniques for performance and energy efficiency. 95
- Samuel Williams, Dhiraj D. Kalamkar, Amik Singh, Anand M. Deshpande, Brian van Straalen, Mikhail Smelyanskiy, Ann S. Almgren, Pradeep Dubey, John Shalf, Leonid Oliker:
Optimization of geometric multigrid for emerging multi- and manycore processors. 96
Communication optimization
- Abhinav Bhatele, Todd Gamblin, Steve H. Langer, Peer-Timo Bremer, Erik W. Draeger, Bernd Hamann, Katherine E. Isaacs, Aaditya G. Landge, Joshua A. Levine, Valerio Pascucci, Martin Schulz, Charles H. Still:
Mapping applications with collectives over sub-communicators on torus networks. 97
- Torsten Hoefler, Timo Schneider:
Optimization principles for collective neighborhood communications. 98
- Zheng Cui, Lei Xia, Patrick G. Bridges, Peter A. Dinda, John R. Lange:
Optimizing overlay-based virtual networking through optimistic interrupts and cut-through forwarding. 99
Linear algebra algorithms
- Evangelos Georganas, Jorge González-Domínguez, Edgar Solomonik, Yili Zheng, Juan Touriño, Katherine A. Yelick:
Communication avoiding and overlapping for numerical linear algebra. 100
- Benjamin Lipshitz, Grey Ballard, James Demmel, Oded Schwartz:
Communication-avoiding parallel strassen: implementation and performance. 101
- Haim Avron, Anshul Gupta:
Managing data-movement for effective shared-memory parallelization of out-of-core sparse solvers. 102
New computer systems
- Greg Faanes, Abdulla Bataineh, Duncan Roweth, Tom Court, Edwin Froese, Robert Alverson, Tim Johnson, Joe Kopnick, Mike Higgins, James Reinhard:
Cray cascade: a scalable HPC system based on a Dragonfly network. 103
- Junichiro Makino, Hiroshi Daisaka:
GRAPE-8: an accelerator for gravitational N-body simulation with 20.5Gflops/W performance. 104
- Greg Thorson, Michael Woodacre:
SGI® UV2: a fused computation and data analysis machine. 105