Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2884781.2884850acmconferencesArticle/Chapter ViewAbstractPublication PagesicseConference Proceedingsconference-collections
research-article
Public Access

Floating-point precision tuning using blame analysis

Published: 14 May 2016 Publication History

Abstract

While tremendously useful, automated techniques for tuning the precision of floating-point programs face important scalability challenges. We present Blame Analysis, a novel dynamic approach that speeds up precision tuning. Blame Analysis performs floating-point instructions using different levels of accuracy for their operands. The analysis determines the precision of all operands such that a given precision is achieved in the final result of the program. Our evaluation on ten scientific programs shows that Blame Analysis is successful in lowering operand precision. As it executes the program only once, the analysis is particularly useful when targeting reductions in execution time. In such case, the analysis needs to be combined with search-based tools such as Precimonious. Our experiments show that combining Blame Analysis with Precimonious leads to obtaining better results with significant reduction in analysis time: the optimized programs execute faster (in three cases, we observe as high as 39.9% program speedup) and the combined analysis time is 9x faster on average, and up to 38x faster than Precimonious alone.

References

[1]
D. An, R. Blue, M. Lam, S. Piper, and G. Stoker. Fpinst: Floating point error analysis using dyninst, 2008.
[2]
H. Anzt, D. Lukarski, S. Tomov, and J. Dongarra. Self-adaptive multiprecision preconditioners on multicore and manycore architectures. Proceedings of 11th International Meeting High Performance Computing for Computational Science, VECPAR, 2014.
[3]
M. Baboulin, A. Buttari, J. Dongarra, J. Kurzak, J. Langou, J. Langou, P. Luszczek, and S. Tomov. Accelerating scientific computations with mixed precision algorithms. Computer Physics Communications, 180(12):2526--2533, 2009.
[4]
D. H. Bailey. High-precision floating-point arithmetic in scientific computation. Computing in Science and Engg., 7(3):54--61, May 2005.
[5]
D. H. Bailey and J. M. Borwein. High-precision arithmetic: Progress and challenges.
[6]
T. Bao and X. Zhang. On-the-fly detection of instability problems in floating-point program execution. In OOPSLA'13, pages 817--832, 2013.
[7]
E. T. Barr, T. Vo, V. Le, and Z. Su. Automatic detection of floating-point exceptions. In R. Giacobazzi and R. Cousot, editors, The 40th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '13, Rome, Italy - January 23-25, 2013, pages 549--560. ACM, 2013.
[8]
F. Benz, A. Hildebrandt, and S. Hack. A dynamic program analysis to find floating-point accuracy problems. In J. Vitek, H. Lin, and F. Tip, editors, ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '12, Beijing, China - June 11-16, 2012, pages 453--462. ACM, 2012.
[9]
J. Bilmes, K. Asanović, C. Chin, and J. Demmel. Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology. In Proceedings of the International Conference on Supercomputing, Vienna, Austria, July 1997. ACM SIGARC. see http://www.icsi.berkeley.edu/~bilmes/phipac.
[10]
B. Boston, A. Sampson, D. Grossman, and L. Ceze. Probability Type Inference for Flexible Approximate Programming. To appear in OOPSLA, 2015.
[11]
A. W. Brown, P. H. J. Kelly, and W. Luk. Profiling floating point value ranges for reconfigurable implementation. In Proceedings of the 1st HiPEAC Workshop on Reconfigurable Computing, pages 6--16, 2007.
[12]
D. Bruening, T. Garnett, and S. Amarasinghe. An infrastructure for adaptive dynamic optimization. In Proceedings of the International Symposium on Code Generation and Optimization: Feedback-directed and Runtime Optimization, 2003.
[13]
A. Buttari, J. Dongarra, J. Kurzak, P. Luszczek, and S. Tomov. Using mixed precision for sparse matrix computations to enhance the performance while achieving 64-bit accuracy. ACM Trans. Math. Softw., 34(4):17:1--17:22, July 2008.
[14]
A. Buttari, J. Dongarra, J. Langou, J. Langou, P. Luszczek, and J. Kurzak. Mixed precision iterative refinement techniques for the solution of dense linear systems. Int. J. High Perform. Comput. Appl., 21(4), Nov. 2007.
[15]
M. Carbin, S. Misailovic, and M. C. Rinard. Verifying quantitative reliability for programs that execute on unreliable hardware. In A. L. Hosking, P. T. Eugster, and C. V. Lopes, editors, Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications, OOPSLA 2013, part of SPLASH 2013, Indianapolis, IN, USA, October 26-31, 2013, pages 33--52. ACM, 2013.
[16]
W. Chiang, G. Gopalakrishnan, Z. Rakamaric, and A. Solovyev. Efficient search for inputs causing high floating-point errors. In J. Moreira and J. R. Larus, editors, ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '14, Orlando, FL, USA, February 15-19, 2014, pages 43--52. ACM, 2014.
[17]
G. P. Contributors. GSL - GNU scientific library - GNU project - free software foundation (FSF). http://www.gnu.org/software/gsl/, 2010.
[18]
E. Darulova and V. Kuncak. Trustworthy numerical computation in scala. In C. V. Lopes and K. Fisher, editors, Proceedings of the 26th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2011, part of SPLASH 2011, Portland, OR, USA, October 22-27, 2011, pages 325--344. ACM, 2011.
[19]
E. Darulova and V. Kuncak. Sound compilation of reals. In S. Jagannathan and P. Sewell, editors, The 41st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '14, San Diego, CA, USA, January 20-21, 2014, pages 235--248. ACM, 2014.
[20]
F. de Dinechin, C. Q. Lauter, and G. Melquiond. Certifying the floating-point implementation of an elementary function using gappa. pages 242--253, 2011.
[21]
D. Delmas, E. Goubault, S. Putot, J. Souyris, K. Tekkal, and F. Védrine. Towards an industrial use of FLUCTUAT on safety-critical avionics software. In M. Alpuente, B. Cook, and C. Joubert, editors, Formal Methods for Industrial Critical Systems, 14th International Workshop, FMICS 2009, Eindhoven, The Netherlands, November 2-3, 2009. Proceedings, volume 5825 of Lecture Notes in Computer Science, pages 53--69. Springer, 2009.
[22]
M. Frigo. A Fast Fourier Transform compiler. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, Atlanta, Georgia, May 1999.
[23]
Y. He and C. H. Q. Ding. Using accurate arithmetics to improve numerical reproducibility and stability in parallel applications. Journal of Supercomputing, 18:259--277, 2001.
[24]
Y. Hida, X. S. Li, and D. H. Bailey. Algorithms for quad-double precision floating point arithmetic. In ARITH'01, 2001.
[25]
M. O. Lam, J. K. Hollingsworth, B. R. de Supinski, and M. P. LeGendre. Automatically adapting programs for mixed-precision floating-point computation. In ICS'13, 2013.
[26]
M. O. Lam, J. K. Hollingsworth, and G. W. Stewart. Dynamic floating-point cancellation detection. Parallel Computing, 39(3):146--155, 2013.
[27]
C. Lattner and V. S. Adve. LLVM: A compilation framework for lifelong program analysis & transformation. In CGO'04, pages 75--88, 2004.
[28]
X. S. Li, J. W. Demmel, D. H. Bailey, G. Henry, Y. Hida, J. Iskandar, W. Kahan, S. Y. Kang, A. Kapur, M. C. Martin, B. J. Thompson, T. Tung, and D. J. Yoo. Design, implementation and testing of extended and mixed precision BLAS. ACM Trans. Math. Softw., 28(2):152--205, June 2002.
[29]
S. Misailovic, M. Carbin, S. Achour, Z. Qi, and M. C. Rinard. Chisel: reliability- and accuracy-aware optimization of approximate computational kernels. In A. P. Black and T. D. Millstein, editors, Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, OOPSLA 2014, part of SPLASH 2014, Portland, OR, USA, October 20-24, 2014, pages 309--328. ACM, 2014.
[30]
N. Nethercote and J. Seward. Valgrind: A framework for heavyweight dynamic binary instrumentation. In Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '07, 2007.
[31]
P. Panchekha, A. Sanchez-Stern, J. R. Wilcox, and Z. Tatlock. Automatically improving accuracy for floating point expressions. In D. Grove and S. Blackburn, editors, Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, Portland, OR, USA, June 15-17, 2015, pages 1--11. ACM, 2015.
[32]
H. Patil, C. Pereira, M. Stallcup, G. Lueck, and J. Cownie. Pinplay: a framework for deterministic replay and reproducible analysis of parallel programs. In Proceedings of the CGO 2010, The 8th International Symposium on Code Generation and Optimization, pages 2--11, 2010.
[33]
M. Püschel, J. M. F. Moura, J. Johnson, D. Padua, M. Veloso, B. Singer, J. Xiong, F. Franchetti, A. Gacic, Y. Voronenko, K. Chen, R. W. Johnson, and N. Rizzolo. SPIRAL: Code generation for DSP transforms. Proceedings of the IEEE, special issue on "Program Generation, Optimization, and Adaptation", 93(2):232--275, 2005.
[34]
T. Ravitch. LLVM Whole-Program Wrapper. https://github.com/travitch/whole-program-llvm, Mar. 2011.
[35]
C. Rubio-González, C. Nguyen, H. D. Nguyen, J. Demmel, W. Kahan, K. Sen, D. H. Bailey, C. Iancu, and D. Hough. Precimonious: tuning assistant for floating-point precision. In SC'13, page 27, 2013.
[36]
A. Sampson, A. Baixo, B. Ransford, T. Moreau, J. Yip, L. Ceze, and M. Oskin. ACCEPT: A Programmer-Guided Compiler Framework for Practical Approximate Computing. 2015.
[37]
W. Saphir, R. V. D. Wijngaart, A. Woo, and M. Yarrow. New implementations and results for the nas parallel benchmarks 2. In In 8th SIAM Conference on Parallel Processing for Scientific Computing, 1997.
[38]
E. Schkufza, R. Sharma, and A. Aiken. Stochastic optimization of floating-point programs with tunable precision. In M. F. P. O'Boyle and K. Pingali, editors, ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '14, Edinburgh, United Kingdom - June 09-11, 2014, page 9. ACM, 2014.
[39]
K. Sen, S. Kalasapur, T. G. Brutch, and S. Gibbs. Jalangi: a selective record-replay and dynamic analysis framework for javascript. In ESEC/FSE'13, pages 488--498, 2013.
[40]
S. Sidiroglou-Douskos, S. Misailovic, H. Hoffmann, and M. C. Rinard. Managing performance vs. accuracy trade-offs with loop perforation. In T. Gyimóthy and A. Zeller, editors, SIGSOFT/FSE'11 19th ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE-19) and ESEC'11: 13rd European Software Engineering Conference (ESEC-13), Szeged, Hungary, September 5-9, 2011, pages 124--134. ACM, 2011.
[41]
A. Solovyev, C. Jacobsen, Z. Rakamaric, and G. Gopalakrishnan. Rigorous estimation of floating-point round-off errors with symbolic taylor expansions. In N. Bjørner and F. D. de Boer, editors, FM 2015: Formal Methods - 20th International Symposium, Oslo, Norway, June 24-26, 2015, Proceedings, volume 9109 of Lecture Notes in Computer Science, pages 532--550. Springer, 2015.
[42]
R. Vuduc, J. Demmel, and K. Yelick. OSKI: A library of automatically tuned sparse matrix kernels. In Proc. of SciDAC 2005, J. of Physics: Conference Series. Institute of Physics Publishing, June 2005.
[43]
C. Whaley. Automatically Tuned Linear Algebra Software (ATLAS). math-atlas.sourceforge.net, 2012.
[44]
I. Yamazaki, S. Tomov, T. Dong, and J. Dongarra. Mixed-precision orthogonalization scheme and adaptive step size for ca-gmres on gpus. Proceedings of 11th International Meeting High Performance Computing for Computational Science, VECPAR, 2014.
[45]
A. Zeller and R. Hildebrandt. Simplifying and isolating failure-inducing input. IEEE Trans. Software Eng., 28(2):183--200, 2002.
[46]
D. Zou, R. Wang, Y. Xiong, L. Zhang, Z. Su, and H. Mei. A genetic algorithm for detecting significant floating-point inaccuracies. In 37th IEEE/ACM International Conference on Software Engineering, ICSE 2015, Florence, Italy, May 16-24, 2015, Volume 1, pages 529--539. IEEE, 2015.

Cited By

View all
  • (2024)Arfa: An Agile Regime-Based Floating-Point Optimization Approach for Rounding ErrorsProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680378(1516-1528)Online publication date: 11-Sep-2024
  • (2024)A Holistic Approach to Automatic Mixed-Precision Code Generation and Tuning for Affine ProgramsProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638484(55-67)Online publication date: 2-Mar-2024
  • (2024)Predicting Performance and Accuracy of Mixed-Precision Programs for Precision TuningProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3623338(1-13)Online publication date: 20-May-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ICSE '16: Proceedings of the 38th International Conference on Software Engineering
May 2016
1235 pages
ISBN:9781450339001
DOI:10.1145/2884781
© 2016 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 May 2016

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. floating point
  2. mixed precision
  3. program optimization

Qualifiers

  • Research-article

Funding Sources

Conference

ICSE '16
Sponsor:

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Upcoming Conference

ICSE 2025

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)163
  • Downloads (Last 6 weeks)36
Reflects downloads up to 12 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Arfa: An Agile Regime-Based Floating-Point Optimization Approach for Rounding ErrorsProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680378(1516-1528)Online publication date: 11-Sep-2024
  • (2024)A Holistic Approach to Automatic Mixed-Precision Code Generation and Tuning for Affine ProgramsProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638484(55-67)Online publication date: 2-Mar-2024
  • (2024)Predicting Performance and Accuracy of Mixed-Precision Programs for Precision TuningProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3623338(1-13)Online publication date: 20-May-2024
  • (2024)Interleaved Execution of Approximated CUDA Kernels in Iterative Applications2024 32nd Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)10.1109/PDP62718.2024.00017(60-67)Online publication date: 20-Mar-2024
  • (2024)Using reactive links to propagate changes across engineering modelsSoftware and Systems Modeling10.1007/s10270-024-01186-wOnline publication date: 10-Jun-2024
  • (2023)Using loop transformations for precision tuning in iterative programs2023 IEEE 30th Symposium on Computer Arithmetic (ARITH)10.1109/ARITH58626.2023.00031(159-166)Online publication date: 4-Sep-2023
  • (2022)moTunerProceedings of the 19th ACM International Conference on Computing Frontiers10.1145/3528416.3530231(94-102)Online publication date: 17-May-2022
  • (2022)Choosing mathematical function implementations for speed and accuracyProceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation10.1145/3519939.3523452(522-535)Online publication date: 9-Jun-2022
  • (2022)Constrained Precision Tuning2022 8th International Conference on Control, Decision and Information Technologies (CoDIT)10.1109/CoDIT55151.2022.9804011(230-236)Online publication date: 17-May-2022
  • (2021)Self-Adaptive Run-Time Variable Floating-Point Precision for Iterative Algorithms: A Joint HW/SW ApproachElectronics10.3390/electronics1018220910:18(2209)Online publication date: 9-Sep-2021
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Get Access

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media