Ozog D, Rahman M, Taylor G and Dinan J. (2019). Designing, Implementing, and Evaluating the Upcoming OpenSHMEM Teams API. 2019 IEEE/ACM Parallel Applications Workshop, Alternatives To MPI (PAW-ATM). 10.1109/PAW-ATM49560.2019.00009. 978-1-7281-5979-9. (37-46).

https://ieeexplore.ieee.org/document/9062752/

Kutil R. (2016). Towards an object oriented programming framework for parallel matrix algorithms. 2016 International Conference on High Performance Computing & Simulation (HPCS). 10.1109/HPCSim.2016.7568413. 978-1-5090-2088-1. (776-783).

http://ieeexplore.ieee.org/document/7568413/

Kumar S, Mamidala A, Heidelberger P, Chen D and Faraj D. (2014). Optimization of MPI collective operations on the IBM Blue Gene/Q supercomputer. The International Journal of High Performance Computing Applications. 10.1177/1094342014552086. 28:4. (450-464). Online publication date: 1-Nov-2014.

http://journals.sagepub.com/doi/10.1177/1094342014552086

Kumar S and Blocksome M. Scalable MPI-3.0 RMA on the Blue Gene/Q Supercomputer. Proceedings of the 21st European MPI Users' Group Meeting. (7-12).

https://doi.org/10.1145/2642769.2642778

Chavarría-Miranda D, Agarwal K and Straatsma T. Scalable PGAS metadata management on extreme scale systems. Proceedings of the 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing. (103-111).

https://doi.org/10.1109/CCGrid.2013.83

Teijeiro C, Taboada G, Touriño J, Doallo R, Mouriño J, Mallón D and Wibecan B. (2013). Design and Implementation of an Extended Collectives Library for Unified Parallel C. Journal of Computer Science and Technology. 10.1007/s11390-013-1313-9. 28:1. (72-89). Online publication date: 1-Jan-2013.

http://link.springer.com/10.1007/s11390-013-1313-9

Spafford K and Vetter J. Aspen. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. (1-11).

/doi/10.5555/2388996.2389110

Alvanos M, Farreras M, Tiotto E and Martorell X. Automatic communication coalescing for irregular computations in UPC language. Proceedings of the 2012 Conference of the Center for Advanced Studies on Collaborative Research. (220-234).

/doi/10.5555/2399776.2399796

González-Domínguez J, Martín M, Taboada G and Touriño J. (2011). Dense Triangular Solvers on Multicore Clusters using UPC. Procedia Computer Science. 10.1016/j.procs.2011.04.025. 4. (231-240).

https://linkinghub.elsevier.com/retrieve/pii/S1877050911000834

Chen L, Liu L, Tang S, Huang L, Jing Z, Xu S, Zhang D and Shou B. (2011). Unified Parallel C for GPU Clusters: Language Extensions and Compiler Implementation. Languages and Compilers for Parallel Computing. 10.1007/978-3-642-19595-2_11. (151-165).

http://link.springer.com/10.1007/978-3-642-19595-2_11

Blagojević F, Hargrove P, Iancu C and Yelick K. Hybrid PGAS runtime support for multicore nodes. Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model. (1-10).

https://doi.org/10.1145/2020373.2020376

Chen L, Liu L, Tang S, Huang L, Jing Z, Xu S, Zhang D and Shou B. Unified parallel C for GPU clusters. Proceedings of the 23rd international conference on Languages and compilers for parallel computing. (151-165).

/doi/10.5555/1964536.1964547

Gahvari H and Gropp W. (2010). An introductory exascale feasibility study for FFTs and multigrid. 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS). 10.1109/IPDPS.2010.5470417. 978-1-4244-6442-5. (1-9).

http://ieeexplore.ieee.org/document/5470417/

Teijeiro C, Taboada G, Touriño J, Fraguela B, Doallo R, Mallón D, Gómez A, Mouriño J and Wibecan B. Evaluation of UPC programmability using classroom studies. Proceedings of the Third Conference on Partitioned Global Address Space Programming Models. (1-7).

https://doi.org/10.1145/1809961.1809975

Mallón D, Gómez A, Mouriño J, Taboada G, Teijeiro C, Touriño J, Fraguela B, Doallo R and Wibecan B. UPC performance evaluation on a multicore system. Proceedings of the Third Conference on Partitioned Global Address Space Programming Models. (1-7).

https://doi.org/10.1145/1809961.1809974

González-Domínguez J, Martín M, Taboada G, Touriño J, Doallo R and Gómez A. A Parallel Numerical Library for UPC. Proceedings of the 15th International Euro-Par Conference on Parallel Processing. (630-641).

https://doi.org/10.1007/978-3-642-03869-3_60

Kumar S, Dozsa G, Berg J, Cernohous B, Miller D, Ratterman J, Smith B and Heidelberger P. Architecture of the Component Collective Messaging Interface. Proceedings of the 15th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface. (23-32).

https://doi.org/10.1007/978-3-540-87475-1_10