Parallel Computing, Volume 39
Volume 39, Number 1, January 2013
- Dana Jacobsen, Inanc Senocak:
Multi-level parallelism for incompressible flow computations on GPU clusters. 1-20
- Masha Sosonkina, Layne T. Watson, Nicholas R. Radcliffe, Raphael T. Haftka, Michael W. Trosset:
Adjusting process count on demand for petascale global optimization. 21-35
- Diego Andrade, Basilio B. Fraguela, Ramon Doallo:
Accurate prediction of the behavior of multithreaded applications in shared caches. 36-57
- Orlando Ayala, Lian-Ping Wang:
Parallel implementation and scalability analysis of 3D Fast Fourier Transform using 2D domain decomposition. 58-77
Volume 39, Number 2, February 2013
- Abhinav Sarje, Srinivas Aluru:
All-pairs computations on many-core graphics processors. 79-93
- Ferit Büyükkeçeci, Omar Awile, Ivo F. Sbalzarini:
A portable OpenCL implementation of generic particle-mesh and mesh-particle interpolation in 2D and 3D. 94-111
Volume 39, Number 3, March 2013
- Mark W. Krentel:
Libmonitor: A tool for first-party monitoring. 114-119
- Nick Rutar, Jeffrey K. Hollingsworth:
Software techniques for negating skid and approximating cache miss measurements. 120-131
- Marc-André Hermanns, Sriram Krishnamoorthy, Felix Wolf:
A scalable infrastructure for the performance analysis of passive target synchronization. 132-145
- Michael O. Lam, Jeffrey K. Hollingsworth, G. W. Stewart:
Dynamic floating-point cancellation detection. 146-155
- Barry Rountree, Todd Gamblin, Bronis R. de Supinski, Martin Schulz, David K. Lowenthal, Guy Cobb, Henry M. Tufo:
Parallelizing heavyweight debugging tools with mpiecho. 156-166
- Joshua D. Goehner, Dorian C. Arnold, Dong H. Ahn, Gregory L. Lee, Bronis R. de Supinski, Matthew P. LeGendre, Barton P. Miller, Martin Schulz:
LIBI: A framework for bootstrapping extreme scale software systems. 167-176
Volume 39, Numbers 4-5, April - May 2013
- Sen Su, Jian Li, Qingjia Huang, Xiao Huang, Kai Shuang, Jie Wang:
Cost-efficient task scheduling for executing large programs in the cloud. 177-188
- George Teodoro, Tony Pan, Tahsin M. Kurç, Jun Kong, Lee A. D. Cooper, Joel H. Saltz:
Efficient irregular wavefront propagation algorithms on hybrid CPU-GPU machines. 189-211
- Jack J. Dongarra, Mathieu Faverge, Thomas Hérault, Mathias Jacquelin, Julien Langou, Yves Robert:
Hierarchical QR factorization algorithms for multi-core clusters. 212-232
- Wagner Kolberg, Pedro de B. Marcos, Julio C. S. dos Anjos, Alexandre K. S. Miyazaki, Cláudio Fernando Resin Geyer, Luciana Arantes:
MRSG - A MapReduce simulator over SimGrid. 233-244
Volume 39, Numbers 6-7, June - July 2013
- Andrew V. Terekhov:
A fast parallel algorithm for solving block-tridiagonal systems of linear equations including the domain decomposition method. 245-258
- Christian Obrecht, Frédéric Kuznik, Bernard Tourancheau, Jean-Jacques Roux:
Scalable lattice Boltzmann solvers for CUDA GPU clusters. 259-270
- Yuefan Deng, Peng Zhang, Carlos Marques, Reid Powell, Li Zhang:
Analysis of Linpack and power efficiencies of the world's TOP500 supercomputers. 271-279
- Ichitaro Yamazaki, Hiroto Tadano, Tetsuya Sakurai, Tsutomu Ikegami:
Performance comparison of parallel eigensolvers based on a contour integral method and a Lanczos method. 280-290
Volume 39, Number 8, August 2013
- Yang Wang, Paul Lu:
DDS: A deadlock detection-based scheduling algorithm for workflow computations in HPC systems with storage constraints. 291-305
- A. Sandroos, Ilja Honkonen, Sebastian von Alfthan, Minna Palmroth:
Multi-GPU simulations of Vlasov's equation using Vlasiator. 306-318
- Oliver Fortmeier, H. Martin Bücker, B. O. Fagginger Auer, Rob H. Bisseling:
A new metric enabling an exact hypergraph model for the communication volume in distributed-memory parallel applications. 319-335
- Harald Servat, Germán Llort, Kevin A. Huck, Judit Giménez, Jesús Labarta:
Framework for a productive performance optimization. 336-353
Volume 39, Number 9, September 2013
- Fangyang Shen, Mei Yang, Maurizio Palesi:
Guest Editors' Introduction to the Special Issue on "Novel On-Chip Parallel Architectures and Software Support". 355-356
- Sandeep Pande, Fearghal Morgan, Gerard J. M. Smit, Tom M. Bruintjes, Jochem H. Rutgers, Brian McGinley, Seamus Cawley, Jim Harkin, Liam McDaid:
Fixed latency on-chip interconnect for hardware spiking neural network architectures. 357-371
- Junghee Lee, Chrysostomos Nicopoulos, Hyung Gyu Lee, Jongman Kim:
Sharded Router: A novel on-chip router architecture employing bandwidth sharding and stealing. 372-388
- Michael Opoku Agyeman, Ali Ahmadinia, Alireza Shahrabi:
Efficient routing techniques in heterogeneous 3D Networks-on-Chip. 389-407
- Xiaohang Wang, Peng Liu, Mei Yang, Yingtao Jiang:
Avoiding request-request type message-dependent deadlocks in networks-on-chips. 408-423
- Ashkan Beyranvand Nejad, Anca Mariana Molnos, Matias Escudero Martinez, Kees Goossens:
A hardware/software platform for QoS bridging over multi-chip NoC-based systems. 424-441
- José M. Andión, Manuel Arenaz, Gabriel Rodríguez, Juan Touriño:
A novel compiler support for automatic parallelization on multicore systems. 442-460
- Jiyang Yu, Peng Liu, Weidong Wang, Chunming Huang, Jie Yang, Yingtao Jiang, Qingdong Yao:
An efficient protocol with synchronization accelerator for multi-processor embedded systems. 461-474
- Carlos H. Gonzalez, Basilio B. Fraguela:
A framework for argument-based task synchronization with automatic detection of dependencies. 475-489
- Guiyuan Jiang, Jigang Wu, Jizhou Sun:
Efficient reconfiguration algorithms for communication-aware three-dimensional processor arrays. 490-503
- Giovanni Mariani, Gianluca Palermo, Vittorio Zaccaria, Cristina Silvano:
ARTE: An Application-specific Run-Time managEment framework for multi-cores based on queuing models. 504-519
- Jingweijia Tan, Yang Yi, Fangyang Shen, Xin Fu:
Modeling and characterizing GPGPU reliability in the presence of soft errors. 520-532
Volume 39, Number 10, October 2013
- Marcin Krotkiewski, Marcin Dabrowski:
Efficient 3D stencil computations using CUDA. 533-548
- Jaume Joven, Andrea Marongiu, Federico Angiolini, Luca Benini, Giovanni De Micheli:
An integrated, programming model-driven framework for NoC-QoS support in cluster-based embedded many-cores. 549-566
- Laiping Zhao, Yizhi Ren, Kouichi Sakurai:
Reliable workflow scheduling with less resource redundancy. 567-585
- Libo Huang, Nong Xiao, Zhiying Wang, Yongwen Wang, Ming-che Lai:
Efficient multimedia coprocessor with enhanced SIMD engines for exploiting ILP and DLP. 586-602
- Dimitris Saougkos, George Manis:
Self adaptive run time scheduling for the automatic parallelization of loops with the C2μTC/SL compiler. 603-614
- Agustín C. Caminero, Antonio Robles-Gómez, Salvador Ros, Roberto Hernández, Llanos Tobarra:
P2P-based resource discovery in dynamic grids allowing multi-attribute and range queries. 615-637
- Xiaoliang Wan, Guang Lin:
Hybrid parallel computing of minimum action method. 638-651
Volume 39, Number 11, November 2013
- Gregory Tauer, Rakesh Nagi:
A map-reduce lagrangian heuristic for multidimensional assignment problems with decomposable costs. 653-668
- Gihan R. Mudalige, Mike B. Giles, Jeyarajan Thiyagalingam, I. Z. Reguly, Carlo Bertolli, Paul H. J. Kelly, Anne E. Trefethen:
Design and initial performance of a high-level unstructured mesh framework on heterogeneous parallel systems. 669-692
- Hameed Hussain, Saif Ur Rehman Malik, Abdul Hameed, Samee Ullah Khan, Gage Bickler, Nasro Min-Allah, Muhammad Bilal Qureshi, Limin Zhang, Yongji Wang, Nasir Ghani, Joanna Kolodziej, Albert Y. Zomaya, Cheng-Zhong Xu, Pavan Balaji, Abhinav Vishnu, Frédéric Pinel, Johnatan E. Pecero, Dzmitry Kliazovich, Pascal Bouvry, Hongxiang Li, Lizhe Wang, Dan Chen, Ammar Rayes:
A survey on resource allocation in high performance distributed computing systems. 709-736
- Hoang-Vu Dang, Bertil Schmidt:
CUDA-enabled Sparse Matrix-Vector Multiplication on GPUs using atomic operations. 737-750
Volume 39, Number 12, December 2013
- Yong Chen, Pavan Balaji, Abhinav Vishnu:
Special issue on programming models, systems software, and tools for High-End Computing. 751-752
- Wei Tang, Dongxu Ren, Zhiling Lan, Narayan Desai:
Toward balanced and sustainable job scheduling for production supercomputers. 753-768
- Mark K. Gardner, Paul Sathre, Wu-chun Feng, Gabriel Martinez:
Characterizing the challenges and evaluating the efficacy of a CUDA-to-OpenCL translator. 769-786
- Zhiyi Huang, Kai-Cheung Leung:
Performance evaluation of View-Oriented Transactional Memory. 787-801
- Ekow J. Otoo, Gideon Nimako, Daniel Ohene-Kwofie:
Chunked extendible dense arrays for scientific data storage. 802-818
- Shannon Steinfadt:
Fine-grained parallel implementations for SWAMP+ Smith-Waterman alignment. 819-833
- Jie Shen, Jianbin Fang, Henk J. Sips, Ana Lucia Varbanescu:
An application-centric evaluation of OpenCL on multi-core CPUs. 834-850
- Hisham Mohamed, Stéphane Marchand-Maillet:
MRO-MPI: MapReduce overlapping using MPI and an optimized data exchange policy. 851-866
- Omer Erdil Albayrak, Ismail Akturk, Ozcan Ozturk:
Improving application behavior on heterogeneous manycore systems through kernel mapping. 867-878
- Alexander Reinefeld, Robert Döbbelin, Thorsten Schütt:
Analyzing the performance of SMP memory allocators with iterative MapReduce applications. 879-889