Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2254064.2254080acmconferencesArticle/Chapter ViewAbstractPublication PagespldiConference Proceedingsconference-collections
research-article

Synthesising graphics card programs from DSLs

Published: 11 June 2012 Publication History

Abstract

Over the last five years, graphics cards have become a tempting target for scientific computing, thanks to unrivaled peak performance, often producing a runtime speed-up of x10 to x25 over comparable CPU solutions.
However, this increase can be difficult to achieve, and doing so often requires a fundamental rethink. This is especially problematic in scientific computing, where experts do not want to learn yet another architecture.
In this paper we develop a method for automatically parallelising recursive functions of the sort found in scientific papers. Using a static analysis of the function dependencies we identify sets - partitions - of independent elements, which we use to synthesise an efficient GPU implementation using polyhedral code generation techniques. We then augment our language with DSL extensions to support a wider variety of applications, and demonstrate the effectiveness of this with three case studies, showing significant performance improvement over equivalent CPU methods, and similar efficiency to hand-tuned GPU implementations.

References

[1]
C. Bastoul. Code generation in the polyhedral model is easier than you think. In PACT'13 IEEE International Conference on Parallel Architecture and Compilation Techniques, pages 7--16, Juan-les-Pins, France, September 2004.
[2]
H. Chafi, Z. DeVito, A. Moors, T. Rompf, A. K. Sujeeth, P. Hanrahan, M. Odersky, and K. Olukotun. Language virtualization for heterogeneous parallel computing. In OOPSLA '10: Proceedings of the ACM international conference on Object oriented programming systems languages and applications, pages 835--847, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0203-6.
[3]
R. Durbin, S. R. Eddy, A. Krogh, and G. Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, July 1999. ISBN 0521629713.
[4]
S. Eddy. HMMer Website, including User Manual. http://hmmer.wustl.edu.
[5]
C. Elliott. Programming graphics processors functionally. In Proceedings of the 2004 ACM SIGPLAN workshop on Haskell, Haskell '04, pages 45--56, New York, NY, USA, 2004. ACM. ISBN 1-58113-850-4. http://doi.acm.org/10.1145/1017472.1017482.
[6]
R. Giegerich and C. Meyer. Algebraic dynamic programming. In AMAST '02: Proceedings of the 9th International Conference on Algebraic Methodology and Software Technology, pages 349--364, London, UK, 2002. Springer-Verlag. ISBN 3-540-44144-1.
[7]
J. A. Gunnels, F. G. Gustavson, G. M. Henry, and R. A. van de Geijn. FLAME: Formal Linear Algebra Methods Environment. ACM Transactions on Mathematical Software, 27 (4): 422--455, Dec. 2001. ISSN 0098-3500. URL http://doi.acm.org/10.1145/504210.504213.
[8]
S. L. P. Jones and S. Singh. A tutorial on parallel and concurrent programming in haskell. In P. W. M. Koopman, R. Plasmeijer, and S. D. Swierstra, editors, Advanced Functional Programming, volume 5832 of Lecture Notes in Computer Science, pages 267--305. Springer, 2008. ISBN 978-3-642-04651-3.
[9]
R. M. Karp and M. Held. Finite-state processes and dynamic programming. SIAM Journal on Applied Mathematics, 15 (3): 693--718, 1967. 10.1137/0115060.
[10]
A. Krogh, I. S. Mian, and D. Haussler. A hidden markov model that finds genes in e.coli dna. Nucleic Acids Research, 22 (22): 4768--4778, 1994. 10.1093/nar/22.22.4768.
[11]
L. Lamport. The parallel execution of do loops. Communications of The ACM, 17: 83--93, February 1974. 10.1145/360827.360844.
[12]
C. Lengauer. Loop parallelization in the polytope model. In CONCUR '93, Lecture Notes in Computer Science 715, pages 398--416. Springer-Verlag, 1993.
[13]
Y. Liu, B. Schmidt, and D. Maskell. Cudasw 2.0: enhanced smith-waterman protein database search on cuda-enabled gpus based on simt and virtualized simd abstractions. BMC Research Notes, 3 (1): 93, 2010. ISSN 1756-0500. 10.1186/1756-0500-3-93.
[14]
G. Lunter. HMMoC a compiler for hidden Markov models. Bioinformatics, 23 (18): 2485--2487, September 2007. 10.1093/bioinformatics/btm350.
[15]
G. Mainland and G. Morrisett. Nikola: embedding compiled gpu functions in haskell. In Haskell '10: Proceedings of the third ACM Haskell symposium on Haskell, pages 67--78, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0252-4.
[16]
W. R. Pearson and D. J. Lipman. Improved tools for biological sequence comparison. Proceedings of The National Academy of Sciences, 85: 2444--2448, 1988.
[17]
P. Steffen, R. Giegerich, and M. Giraud. Gpu parallelization of algebraic dynamic programming. 2009. URL HAL:http://hal.archives-ouvertes.fr/inria-00438219/en/.
[18]
J. Svensson, K. Claessen, and M. Sheeran. Gpgpu kernel implementation and refinement using obsidian. Procedia Computer Science, 1 (1): 2065--2074, 2010. ISSN 1877-0509. ICCS 2010.
[19]
A. van Deursen, P. Klint, and J. Visser. Domain-specific languages: an annotated bibliography. SIGPLAN Not., 35: 26--36, June 2000.
[20]
H. Verge, C. Mauras, and P. Quinton. The alpha language and its use for the design of systolic arrays. The Journal of VLSI Signal Processing, 3: 173--182, 1991. ISSN 0922-5773. URL http://dx.doi.org/10.1007/BF00925828. 10.1007/BF00925828.
[21]
J. P. Walters, V. Balu, S. Kompalli, and V. Chaudhary. Evaluating the use of gpus in liver image segmentation and hmmer database searches. In IPDPS '09: Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing, pages 1--12, Washington, DC, USA, 2009. IEEE Computer Society. ISBN 978-1-4244-3751-1.

Cited By

View all
  • (2024)Trellis: A Domain-Specific Language for Hidden Markov Models with Sparse TransitionsProceedings of the 17th ACM SIGPLAN International Conference on Software Language Engineering10.1145/3687997.3695641(196-209)Online publication date: 17-Oct-2024
  • (2015)Stepwise-refinement for performanceConcurrency and Computation: Practice & Experience10.1002/cpe.341627:17(4515-4554)Online publication date: 10-Dec-2015
  • (2014)Staged parser combinators for efficient data processingACM SIGPLAN Notices10.1145/2714064.266024149:10(637-653)Online publication date: 15-Oct-2014
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
PLDI '12: Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2012
572 pages
ISBN:9781450312059
DOI:10.1145/2254064
  • cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 47, Issue 6
    PLDI '12
    June 2012
    534 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/2345156
    Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 June 2012

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. dynamic programming
  2. gpu
  3. program synthesis
  4. scientific applications

Qualifiers

  • Research-article

Conference

PLDI '12
Sponsor:

Acceptance Rates

PLDI '12 Paper Acceptance Rate 48 of 255 submissions, 19%;
Overall Acceptance Rate 406 of 2,067 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)10
  • Downloads (Last 6 weeks)2
Reflects downloads up to 18 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Trellis: A Domain-Specific Language for Hidden Markov Models with Sparse TransitionsProceedings of the 17th ACM SIGPLAN International Conference on Software Language Engineering10.1145/3687997.3695641(196-209)Online publication date: 17-Oct-2024
  • (2015)Stepwise-refinement for performanceConcurrency and Computation: Practice & Experience10.1002/cpe.341627:17(4515-4554)Online publication date: 10-Dec-2015
  • (2014)Staged parser combinators for efficient data processingACM SIGPLAN Notices10.1145/2714064.266024149:10(637-653)Online publication date: 15-Oct-2014
  • (2014)Staged parser combinators for efficient data processingProceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications10.1145/2660193.2660241(637-653)Online publication date: 15-Oct-2014
  • (2014)A Module System for Domain-Specific LanguagesTheory and Practice of Logic Programming10.1017/S147106841400033714:4-5(771-785)Online publication date: 21-Jul-2014

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media