DOI: 10.5555/3130379.3130473
research-article

Towards exascale computing with heterogeneous architectures

Published: 27 March 2017

Abstract

Reaching exascale computing is made especially challenging by the highly heterogeneous nature of modern platforms and by the energy they consume. Compute nodes typically use multiple multi-core CPUs and are increasingly equipped with PCIe-based accelerators, and both contribute to increasingly dynamic power consumption. In our study, we evaluate our target application on a variety of heterogeneous platforms, including high-end FPGA, GPU, and Xeon Phi accelerators, with respect to energy efficiency at the node and cluster level. We compare multiple implementations of our application, each built with a different modern parallel programming framework, with respect to execution performance, code complexity, and energy efficiency. Finally, based on our findings, we extrapolate the implications of scaling this application towards exascale, with projections of the computation achievable within the exascale power budget for our three architectures.


Information

Published In

DATE '17: Proceedings of the Conference on Design, Automation & Test in Europe
March 2017
1814 pages

Publisher

European Design and Automation Association

Leuven, Belgium

Qualifiers

  • Research-article
