Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article
Open access

Efficient differentiable programming in a functional array-processing language

Published: 26 July 2019 Publication History

Abstract

We present a system for the automatic differentiation (AD) of a higher-order functional array-processing language. The core functional language underlying this system simultaneously supports both source-to-source forward-mode AD and global optimisations such as loop transformations. In combination, gradient computation with forward-mode AD can be as efficient as reverse mode, and that the Jacobian matrices required for numerical algorithms such as Gauss-Newton and Levenberg-Marquardt can be efficiently computed.

Supplementary Material

WEBM File (a97-shaikhha.webm)

References

[1]
Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. TensorFlow: A System for Large-Scale Machine Learning. In OSDI, Vol. 16. 265–283.
[2]
Sameer Agarwal, Noah Snavely, Steven M Seitz, and Richard Szeliski. 2010. Bundle adjustment in the large. In European conference on computer vision. Springer, 29–42.
[3]
Eugene Agichtein, Eric Brill, and Susan Dumais. 2006. Improving Web Search Ranking by Incorporating User Behavior Information. In SIGIR.
[4]
Johan Anker and Josef Svenningsson. 2013. An EDSL approach to high performance Haskell programming. In ACM Haskell Symposium. 1–12.
[5]
Ricardo Baeza-Yates, Carlos Hurtado, and Marcelo Mendoza. 2004. Query Recommendation Using Query Logs in Search Engines. In EDBT.
[6]
Atilim Gunes Baydin, Barak A Pearlmutter, Alexey Andreyevich Radul, and Jeffrey Mark Siskind. 2015b. Automatic differentiation in machine learning: a survey. arXiv preprint arXiv:1502.05767 (2015).
[7]
Atilim Gunes Baydin, Barak A Pearlmutter, and Jeffrey Mark Siskind. 2015a. Diffsharp: Automatic differentiation library. arXiv preprint arXiv:1511.07727 (2015).
[8]
James Bergstra, Olivier Breuleux, Frédéric Bastien, Pascal Lamblin, Razvan Pascanu, Guillaume Desjardins, Joseph Turian, David Warde-Farley, and Yoshua Bengio. 2010. Theano: A CP U and GP U math compiler in Python. In Proc. 9th Python in Science Conf. 1–7.
[9]
Michael W. Berry, Murray Browne, Amy N. Langville, V. Paul Pauca, and Robert J. Plemmons. 2006. Algorithms and applications for approximate nonnegative matrix factorization. In Computational Statistics and Data Analysis.
[10]
Christian Bischof, Peyvand Khademi, Andrew Mauer, and Alan Carle. 1996. ADIFOR 2.0: Automatic differentiation of Fortran 77 programs. IEEE Computational Science and Engineering 3, 3 (1996), 18–32.
[11]
Christian H Bischof, HM Bucker, Bruno Lang, Arno Rasch, and Andre Vehreschild. 2002. Combining source transformation and operator overloading techniques to compute derivatives for MATLAB programs. In Source Code Analysis and Manipulation, 2002. Proceedings. Second IEEE International Workshop on. IEEE, 65–72.
[12]
Charisee Chiw, Gordon Kindlmann, John Reppy, Lamont Samuels, and Nick Seltzer. 2012. Diderot: A Parallel DSL for Image Analysis and Visualization (PLDI ’12). ACM, 111–120.
[13]
Koen Claessen, Mary Sheeran, and Bo Joel Svensson. 2012. Expressive Array Constructs in an Embedded GP U Kernel Programming Language (DAMP ’12). ACM, NY, USA, 21–30.
[14]
Cliff Click. 1995. Global code motion/global value numbering. ACM Sigplan Notices 30, 6 (1995), 246–257.
[15]
Duncan Coutts, Roman Leshchinskiy, and Don Stewart. 2007. Stream Fusion. From Lists to Streams to Nothing at All (ICFP ’07).
[16]
Olivier Danvy and Andrzej Filinski. 1990. Abstracting control. In Proceedings of the 1990 ACM conference on LISP and functional programming. ACM, 151–160.
[17]
Frédéric De Mesmay, Arpad Rimmel, Yevgen Voronenko, and Markus Püschel. 2009. Bandit-based optimization on graphs with application to library performance tuning. In Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 729–736.
[18]
Zachary DeVito, Michael Mara, Michael Zollhöfer, Gilbert Bernstein, Jonathan Ragan-Kelley, Christian Theobalt, Pat Hanrahan, Matthew Fisher, and Matthias Nießner. 2016. Opt: A Domain Specific Language for Non-linear Least Squares Optimization in Graphics and Imaging. arXiv preprint arXiv:1604.06525 (2016).
[19]
Conal Elliott. 2018. The Simple Essence of Automatic Differentiation. Proc. ACM Program. Lang. 2, ICFP, Article 70 (July 2018), 70:1–70:29 pages.
[20]
Conal M Elliott. 2009. Beautiful differentiation. In ACM Sigplan Notices, Vol. 44. ACM, 191–202.
[21]
Cormac Flanagan, Amr Sabry, Bruce F Duba, and Matthias Felleisen. 1993. The Essence of Compiling with Continuations. In ACM SIGPLAN Notices, Vol. 28. ACM, 237–247.
[22]
Shaun A Forth. 2006. An efficient overloaded implementation of forward mode automatic differentiation in MATLAB. ACM Transactions on Mathematical Software (TOMS) 32, 2 (2006), 195–222.
[23]
Jeremy Gibbons. 2006. Fission for Program Comprehension. In Mathematics of Program Construction, Tarmo Uustalu (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 162–179.
[24]
Andrew Gill, John Launchbury, and Simon L Peyton Jones. 1993. A short cut to deforestation (FPCA). ACM, 223–232.
[25]
Clemens Grelck and Sven-Bodo Scholz. 2006. SAC—A Functional Array Language for Efficient Multi-threaded Execution. Int. Journal of Parallel Programming 34, 4 (2006), 383–427.
[26]
Brian Guenter. 2007. Efficient Symbolic Differentiation for Graphics Applications. ACM Trans. Graph. 26, 3, Article 108 (July 2007).
[27]
Laurent Hascoet and Valérie Pascual. 2013. The Tapenade Automatic Differentiation Tool: Principles, Model, and Specification. ACM Trans. Math. Softw. 39, 3, Article 20 (May 2013), 43 pages.
[28]
Troels Henriksen, Niels GW Serup, Martin Elsman, Fritz Henglein, and Cosmin E Oancea. 2017. Futhark: purely functional GP U-programming with nested parallelism and in-place array updates. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, 556–571.
[29]
Florent Hivert and N Thiéry. 2004. MuPAD-Combinat, an open-source package for research in algebraic combinatorics. Sém. Lothar. Combin 51 (2004), 70.
[30]
Robin J. Hogan. 2014. Fast Reverse-Mode Automatic Differentiation Using Expression Templates in C++. ACM Trans. Math. Softw. 40, 4, Article 26 (July 2014), 16 pages.
[31]
Paul Hudak. 1996. Building Domain-specific Embedded Languages. ACM Comput. Surv. 28, 4 (Dec. 1996).
[32]
Graham Hutton. 1999. A tutorial on the universality and expressiveness of fold. Journal of Functional Programming 9, 4 (1999), 355–372.
[33]
Kenneth E Iverson. 1962. A Programming Language. In Proceedings of the May 1-3, 1962, spring joint computer conference. ACM, 345–351.
[34]
Hong-Jiang Zhang Ji-rong Wen, Jian-Yun Nie and. 2002. Query Clustering Using User Logs. ACM Transactions on Information Systems 20, 1 (2002).
[35]
Simon Peyton Jones, Andrew Tolmach, and Tony Hoare. 2001. Playing by the rules: rewriting as a practical optimisation technique in GHC. In Haskell workshop, Vol. 1. 203–233.
[36]
Manohar Jonnalagedda and Sandro Stucki. 2015. Fold-based Fusion As a Library: A Generative Programming Pearl. In Proceedings of the 6th ACM SIGPLAN Symposium on Scala. ACM, 41–50.
[37]
Jerzy Karczmarczuk. 1999. Functional differentiation of computer programs. ACM SIGPLAN Notices 34, 1 (1999), 195–203.
[38]
Kamil A Khan and Paul I Barton. 2015. A vector forward mode of automatic differentiation for generalized derivative evaluation. Optimization Methods and Software 30, 6 (2015), 1185–1212.
[39]
Oleg Kiselyov, Aggelos Biboudis, Nick Palladinos, and Yannis Smaragdakis. 2017. Stream Fusion, to Completeness. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2017). ACM, New York, NY, USA, 285–299.
[40]
Chris Leary and Todd Wang. 2017. XLA: TensorFlow, compiled. TensorFlow Dev Summit (2017).
[41]
Kenneth Levenberg. 1944. A method for the solution of certain non-linear problems in least squares. Quarterly of applied mathematics 2, 2 (1944), 164–168.
[42]
Chao Liu, Hung-chih Yang, Jinliang Fan, Li-Wei He, and Yi-Min Wang. 2010. Distributed nonnegative matrix factorization for web-scale dyadic data analysis on mapreduce. In Proceedings of the 19th international conference on World wide web. ACM, 681–690.
[43]
Dougal Maclaurin, David Duvenaud, and Ryan P Adams. 2015. Autograd: Effortless gradients in numpy. In ICML 2015 AutoML Workshop.
[44]
Donald W Marquardt. 1963. An algorithm for least-squares estimation of nonlinear parameters. Journal of the society for Industrial and Applied Mathematics 11, 2 (1963), 431–441.
[45]
Jorge J Moré. 1978. The Levenberg-Marquardt algorithm: implementation and theory. In Numerical analysis. Springer, 105–116.
[46]
Sri Hari Krishna Narayanan, Boyana Norris, and Beata Winnicka. 2010. ADIC2: Development of a component source transformation system for differentiating C and C++. Procedia Computer Science 1, 1 (2010), 1845–1853.
[47]
Peter Norvig. 1992. Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp. Morgan Kaufmann.
[48]
Lionel Parreaux, Amir Shaikhha, and Christoph E. Koch. 2017a. Quoted Staged Rewriting: A Practical Approach to Librarydefined Optimizations. In Proceedings of the 16th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE 2017). ACM, New York, NY, USA, 131–145.
[49]
Lionel Parreaux, Antoine Voizard, Amir Shaikhha, and Christoph E. Koch. 2017b. Unifying Analytic and Statically-typed Quasiquotes. Proc. ACM Program. Lang. 2, POPL, Article 13 (Dec. 2017), 33 pages.
[50]
Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. Automatic differentiation in pytorch. (2017).
[51]
Barak A Pearlmutter and Jeffrey Mark Siskind. 2007. Lazy multivariate higher-order forward-mode AD. In ACM SIGPLAN Notices, Vol. 42. ACM, 155–160.
[52]
Barak A Pearlmutter and Jeffrey Mark Siskind. 2008. Reverse-mode AD in a functional framework: Lambda the ultimate backpropagator. ACM Transactions on Programming Languages and Systems (TOPLAS) 30, 2 (2008), 7.
[53]
Markus Puschel, José MF Moura, Jeremy R Johnson, David Padua, Manuela M Veloso, Bryan W Singer, Jianxin Xiong, Franz Franchetti, Aca Gacic, Yevgen Voronenko, et al. 2005. SPIRAL: Code generation for DSP transforms. Proc. IEEE 93, 2 (2005), 232–275.
[54]
Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Frédo Durand, and Saman Amarasinghe. 2013. Halide: A Language and Compiler for Optimizing Parallelism, Locality, and Recomputation in Image Processing Pipelines (PLDI ’13).
[55]
Jarrett Revels, Miles Lubin, and Theodore Papamarkou. 2016. Forward-mode automatic differentiation in Julia. arXiv preprint arXiv:1607.07892 (2016).
[56]
Tiark Rompf and Martin Odersky. 2010. Lightweight Modular Staging: A Pragmatic Approach to Runtime Code Generation and Compiled DSLs. In the ninth international conference on Generative programming and component engineering (GPCE ’10). ACM, New York, NY, USA, 127–136.
[57]
Tiark Rompf, Arvind K. Sujeeth, Nada Amin, Kevin J. Brown, Vojin Jovanovic, HyoukJoong Lee, Manohar Jonnalagedda, Kunle Olukotun, and Martin Odersky. 2013. Optimizing Data Structures in High-level Programs: New Directions for Extensible Compilers Based on Staging. In Proceedings of the 40th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’13). ACM, New York, NY, USA, 497–510.
[58]
Nadav Rotem, Jordan Fix, Saleem Abdulrasool, Garret Catron, Summer Deng, Roman Dzhabarov, Nick Gibson, James Hegeman, Meghan Lele, Roman Levenstein, et al. 2018. Glow: Graph lowering compiler techniques for neural networks. arXiv preprint arXiv:1805.00907 (2018).
[59]
Amir Shaikhha, Mohammad Dashti, and Christoph Koch. 2018. Push versus Pull-Based Loop Fusion in Query Engines. Journal of Functional Programming 28 (2018), e10.
[60]
Amir Shaikhha, Andrew Fitzgibbon, Simon Peyton Jones, and Dimitrios Vytiniotis. 2017. Destination-passing Style for Efficient Memory Management. In Proceedings of the 6th ACM SIGPLAN International Workshop on Functional HighPerformance Computing (FHPC 2017). ACM, New York, NY, USA, 12–23.
[61]
Amir Shaikhha and Lionel Parreaux. 2019. Finally, a Polymorphic Linear Algebra Language. In Proceedings of the 33rd European Conference on Object-Oriented Programming (ECOOP’19).
[62]
Jeffrey Mark Siskind and Barak A Pearlmutter. 2005. Perturbation confusion and referential transparency: Correct functional implementation of forward-mode AD. (2005).
[63]
Jeffrey Mark Siskind and Barak A Pearlmutter. 2008. Nesting forward-mode AD in a functional framework. Higher-Order and Symbolic Computation 21, 4 (2008), 361–376.
[64]
Daniele G Spampinato and Markus Püschel. 2016. A basic linear algebra compiler for structured matrices. In CGO ’16. ACM.
[65]
Suvrit Sra and Inderjit S. Dhillon. 2006. Nonnegative matrix approximation: algorithms and applications. Technical Report.
[66]
Michel Steuwer, Christian Fensch, Sam Lindley, and Christophe Dubach. 2015. Generating Performance Portable Code Using Rewrite Rules: From High-level Functional Expressions to High-performance OpenCL Code. In Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming (ICFP 2015). ACM, New York, NY, USA, 205–217.
[67]
Arvind Sujeeth, HyoukJoong Lee, Kevin Brown, Tiark Rompf, Hassan Chafi, Michael Wu, Anand Atreya, Martin Odersky, and Kunle Olukotun. 2011. OptiML: An Implicitly Parallel Domain-Specific Language for Machine Learning (ICML ’11). 609–616.
[68]
Josef Svenningsson. 2002. Shortcut Fusion for Accumulating Parameters & Zip-like Functions (ICFP ’02). ACM, 124–132.
[69]
Bo Joel Svensson and Josef Svenningsson. 2014. Defunctionalizing Push Arrays (FHPC ’14). ACM, NY, USA, 43–52.
[70]
Walid Taha and Tim Sheard. 2000. MetaML and multi-stage programming with explicit annotations. Theor. Comput. Sci. 248, 1-2 (2000), 211–242.
[71]
Bill Triggs, Philip F McLauchlan, Richard I Hartley, and Andrew W Fitzgibbon. 1999. Bundle adjustment—a modern synthesis. In Inter. workshop on vision algorithms. Springer, 298–372.
[72]
Philip Wadler. 1988. Deforestation: Transforming programs to eliminate trees. In ESOP’88. Springer, 344–358.
[73]
Fei Wang, James Decker, Xilun Wu, Gregory Essertel, and Tiark Rompf. 2018. Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming. In Advances in Neural Information Processing Systems. 10200– 10211.
[74]
Matthew J Weinstein and Anil V Rao. 2016. Algorithm: ADiGator, a toolbox for the algorithmic differentiation of mathematical functions in MATLAB using source transformation via operator overloading. ACM Trans. Math. Softw (2016).
[75]
Christopher Zach. 2014. Robust bundle adjustment revisited. In European Conference on Computer Vision. Springer, 772–787.

Cited By

View all
  • (2024)Restaging Domain-Specific Languages: A Flexible Design Pattern for Rapid Development of Optimizing CompilersProceedings of the 23rd ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences10.1145/3689484.3690739(80-93)Online publication date: 21-Oct-2024
  • (2024)Efficient CHADProceedings of the ACM on Programming Languages10.1145/36328788:POPL(1060-1088)Online publication date: 5-Jan-2024
  • (2024)TapeFlow: Streaming Gradient Tapes in Automatic DifferentiationProceedings of the 2024 IEEE/ACM International Symposium on Code Generation and Optimization10.1109/CGO57630.2024.10444805(81-92)Online publication date: 2-Mar-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Programming Languages
Proceedings of the ACM on Programming Languages  Volume 3, Issue ICFP
August 2019
1054 pages
EISSN:2475-1421
DOI:10.1145/3352468
Issue’s Table of Contents
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 July 2019
Published in PACMPL Volume 3, Issue ICFP

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Code Motion
  2. Differentiable Programming
  3. Linear Algebra
  4. Loop Fusion
  5. Optimising Compilers

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)286
  • Downloads (Last 6 weeks)20
Reflects downloads up to 13 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Restaging Domain-Specific Languages: A Flexible Design Pattern for Rapid Development of Optimizing CompilersProceedings of the 23rd ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences10.1145/3689484.3690739(80-93)Online publication date: 21-Oct-2024
  • (2024)Efficient CHADProceedings of the ACM on Programming Languages10.1145/36328788:POPL(1060-1088)Online publication date: 5-Jan-2024
  • (2024)TapeFlow: Streaming Gradient Tapes in Automatic DifferentiationProceedings of the 2024 IEEE/ACM International Symposium on Code Generation and Optimization10.1109/CGO57630.2024.10444805(81-92)Online publication date: 2-Mar-2024
  • (2024)A Tensor Algebra Compiler for Sparse DifferentiationProceedings of the 2024 IEEE/ACM International Symposium on Code Generation and Optimization10.1109/CGO57630.2024.10444787(1-12)Online publication date: 2-Mar-2024
  • (2024)Automatic differentiation for ML-family languages: Correctness via logical relationsMathematical Structures in Computer Science10.1017/S0960129524000215(1-60)Online publication date: 21-Oct-2024
  • (2023)Auto-differentiation of relational computations for very large scale machine learningProceedings of the 40th International Conference on Machine Learning10.5555/3618408.3619806(33581-33598)Online publication date: 23-Jul-2023
  • (2023)Reverse-Mode AD of Multi-Reduce and Scan in FutharkProceedings of the 35th Symposium on Implementation and Application of Functional Languages10.1145/3652561.3652575(1-14)Online publication date: 29-Aug-2023
  • (2023)Partial Evaluation of Automatic Differentiation for Differential-Algebraic Equations SolversProceedings of the 22nd ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences10.1145/3624007.3624054(57-71)Online publication date: 22-Oct-2023
  • (2023)Graph IRs for Impure Higher-Order Languages: Making Aggressive Optimizations Affordable with Precise Effect DependenciesProceedings of the ACM on Programming Languages10.1145/36228137:OOPSLA2(400-430)Online publication date: 16-Oct-2023
  • (2023)Using Rewrite Strategies for Efficient Functional Automatic DifferentiationProceedings of the 25th ACM International Workshop on Formal Techniques for Java-like Programs10.1145/3605156.3606456(51-57)Online publication date: 18-Jul-2023
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media