Nothing Special   »   [go: up one dir, main page]

Skip to main content

Sums of Weighted Bits

  • Chapter
  • First Online:
Application-Specific Arithmetic

Abstract

A large class of composite fixed-point computations can be expressed as a global sum of weighted bits. This point of view has many advantages, the main one being that the associativity of the sum can be exploited at the bit level to build efficient architectures, called compressor trees, that perform such computations. This chapter defines a data structure, the bit heap, that captures sums of weighted bits. It then discusses the construction of application-specific bit heaps. Finally, it studies the implementation of the corresponding compressor trees.

I don’t know what technology will look like 20 years from now, but I know one thing: people will still need to compute additions and multiplications.Jean-Michel Muller

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 139.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Hardcover Book
USD 179.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    The fanout is the number of gate inputs connected to a single output.

  2. 2.

    This simplifying hypothesis is true for the multiplier, but not, for example, when using a divider by 3 to compute some bits in (7.7) and an AND gate to compute some others. In our presentation we will ignore such considerations for simplicity, but they can easily be managed in the method presented here.

  3. 3.

    This bound is attained in the two examples taken in this chapter, but we will show in the sequel why it may be pessimistic in some cases.

  4. 4.

    If it is possible to manage such correlations by a rewriting that reduces the bit heap size, it is of course better.

  5. 5.

    If input bits arrive late or in a different cycle [Bru+13], this can be easily extended to setting Ns,c for other stages as well.

References

  1. Tetrahedral (or triangular pyramidal) numbers. On-Line Encyclopedia of Integer Sequences. http://oeis.org/A000292. 2010

  2. A000337. On-Line Encyclopedia of Integer Sequences. http://oeis.org/A000337. 2010

  3. Anton Blad and Oscar Gustafsson. “Integer Linear Programming-Based Bit-Level Optimization for High-Speed FIR Decimation Filter Architectures”. In: Circuits, Systems, and Signal Processing 29.1 (2009), pp. 81–101

    Google Scholar 

  4. Nicolas Brunie, Florent de Dinechin, Matei Iştoan, Guillaume Sergent, Kinga Illyes, and Bogdan Popa. “Arithmetic Core Generation Using Bit Heaps”. In: Field Programmable Logic and Application (FPL). IEEE, 2013, pp. 1–8

    Google Scholar 

  5. K’Andrea C. Bickerstaff, Earl E. Swartzlander, and Michael J. Schulte. “Reduced Area Multipliers”. In: Application-Specific Array Processors. 1993, pp. 478–489

    Google Scholar 

  6. Charles R. Baugh and Bruce A.Wooley. “A Two’s Complement Parallel Array Multiplication Algorithm”. In: IEEE Transactions on Computers C-22.12 (1973), pp. 1045–1047

    Google Scholar 

  7. Chip-Hong Chang, Jiangmin Gu, and Mingyan Zhang. “Ultra Low-Voltage Low-Power CMOS 4-2 and 5-2 Compressors for Fast Arithmetic Circuits”. In: IEEE Transactions on Circuits and Systems I: Regular Papers 51.10 (2004), pp. 1985–1997

    Google Scholar 

  8. Chichyang Chen. “High-order Taylor series approximation for efficient computation of elementary functions”. In: IET Computers & Digital Techniques 9.6 (2015), pp. 328–335

    Google Scholar 

  9. Luigi Dadda. “Some Schemes For Parallel Multipliers”. In: Alta Frequenza 45.5 (1965), pp. 349–356

    Google Scholar 

  10. Davide De Caro, Ettore Napoli, Darjn Esposito, Gerardo Castellano, Nicola Petra, and Antonio GM Strollo. “Minimizing Coefficients Wordlength for Piecewise-Polynomial Hardware Function Evaluation With Exact or Faithful Rounding”. In: IEEE Transactions on Circuits and Systems I: Regular Papers 64.5 (2017), pp. 1187–1200

    Google Scholar 

  11. Florent de Dinechin, Matei Iştoan, and Guillaume Sergent. “Fixed-Point Trigonometric Functions on FPGAs”. In: SIGARCH Computer Architecture News 41.5 (2013), pp. 83–88

    Google Scholar 

  12. Theo A. Drane, Thomas M. Rose, and George A. Constantinides. “On the Systematic Creation of Faithfully Rounded Truncated Multipliers and Arrays”. In: IEEE Transactions on Computers 63.10 (2014), pp. 2513–2525

    Google Scholar 

  13. Miloš D. Ercegovac and Tomás Lang. Digital Arithmetic. Morgan Kaufmann, 2004

    Google Scholar 

  14. Christopher Fritz and Adly T. Fam. “Fast Binary Counters Based on Symmetric Stacking”. In: Transactions on Very Large Scale Integration (VLSI) Systems 25.10 (2017), pp. 2971–2975

    Google Scholar 

  15. K. A. Feiste and Earl E. Swartzlander. “Merged Arithmetic Revisited”. In: Workshop on Signal Processing Systems (SiPS). IEEE. 1997, pp. 212–221

    Google Scholar 

  16. Ghassem Jaberipur, Behrooz Parhami, and Mohammad Ghodsi. “An Efficient Universal Addition Scheme for All Hybrid-Redundant Representations withWeighted Bit-Set Encoding”. In: Journal of VLSI Signal Processing 42 (2006), pp. 149–158

    Google Scholar 

  17. Robert F. Jones and Earl E. Swartzlander. “Parallel Counter Implementation”. In: Journal of VLSI signal processing systems for signal, image and video technology 7.3 (1994), pp. 223–232

    Google Scholar 

  18. Hou-Jen Ko and Shen-Fu Hsiao. “Design and Application of Faithfully Rounded and Truncated Multipliers With Combined Deletion, Reduction, Truncation, and Rounding”. In: Transactions on Circuits and Systems II: Express Briefs 58.5 (2011), pp. 304–308

    Google Scholar 

  19. Martin Kumm and Johannes Kappauf. “Advanced Compressor Tree Synthesis for FPGAs”. In: IEEE Transactions on Computers 67.8 (2018), pp. 1078–1091

    Google Scholar 

  20. Eric J. King and Earl E. Swartzlander. “Data-Dependent Truncation Scheme for Parallel Multipliers”. In: Asilomar Conference on Signals, Circuits and Systems. Vol. 2. IEEE, 1997, pp. 1178–1182

    Google Scholar 

  21. Martin Kumm and Peter Zipf. “Efficient High Speed Compression Trees on Xilinx FPGAs”. In: Methoden und Beschreibungssprachen zur Modellierung und Verifikation von Schaltungen und Systemen (MBMV). 2014, pp. 171–182

    Google Scholar 

  22. Martin Kumm and Peter Zipf. “Pipelined Compressor Tree Optimization Using Integer Linear Programming”. In: International Conference on Field Programmable Logic and Application (FPL). IEEE, 2014, pp. 1–8

    Google Scholar 

  23. Patrik Larsson and Chris J. Nicol. “Transition Reduction in Carry-Save Adder Trees”. In: International Symposium on Low Power Electronics and Design. IEEE, 1996, pp. 85–88

    Google Scholar 

  24. Angelo R. Meo. “Arithmetic Networks and Their Minimization Using a New Line of Elementary Units”. In: IEEE Transactions on Computers C-24.3 (1975), pp. 258–280

    Google Scholar 

  25. Taeko Matsunaga, Shinji Kimura, and Yusuke Matsunaga. “Power and Delay Aware Synthesis of Multi-Operand Adders Targeting LUT-Based FPGAs”. In: International Symposium on Low Power Electronics and Design (ISLPED). 2011, pp. 217–222

    Google Scholar 

  26. Taeko Matsunaga, Shinji Kimura, and Yusuke Matsunaga. “An Exact Approach for GPC-Based Compressor Tree Synthesis”. In: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E96-A.12 (2013), pp. 2553–2560

    Google Scholar 

  27. Mayur Mehta, Vijay Parmar, and Earl E. Swartzlander. “High-Speed Multiplier Design Using Multi-Input Counter and Compressor Circuits”. In: Symposium on Computer Arithmetic (ARITH). IEEE, 1991, pp. 43–50

    Google Scholar 

  28. Romain Michard, Arnaud Tisserand, and Nicolas Veyrat- Charvillon. “Carry Prediction and Selection for Truncated Multiplication”. In: Workshop on Signal Processing Systems (SiPS). IEEE. 2006, pp. 339–344

    Google Scholar 

  29. Hadi Parandeh-Afshar, Arkosnato Neogy, Philip Brisk, and Paolo Ienne. “Compressor Tree Synthesis on Commercial High-Performance FPGAs”. In: ACM Transactions on Reconfigurable Technology and Systems (TRETS) 4.4 (2011), pp. 1–19

    Google Scholar 

  30. Behrooz Parhami. Computer Arithmetic, Algorithms and Hardware Designs. 2nd ed. Oxford University Press, 2010

    Google Scholar 

  31. Hadi Parandeh-Afshar, Philip Brisk, and Paolo Ienne. “Efficient Synthesis of Compressor Trees on FPGAs”. In: Asia and South Pacific Design Automation Conference (ASPDAC). IEEE, 2008, pp. 138–143

    Google Scholar 

  32. Hadi Parandeh-Afshar, Philip Brisk, and Paolo Ienne. “Improving Synthesis of Compressor Trees on FPGAs via Integer Linear Programming”. In: Design, Automation and Test in Europe (DATE). IEEE, 2008, pp. 1256–1261

    Google Scholar 

  33. Nicola Petra, Davide De Caro, Valeria Garofalo, Ettore Napoli, and Antonio G.M. Strollo. “Truncated Binary MultipliersWith Variable Correction and Minimum Mean Square Error”. In: Transactions on Circuits and Systems I: Regular Papers 57.6 (2010), pp. 1312–1325

    Google Scholar 

  34. Karuna Prasad and Keshab K. Parhi. “Low-Power 4-2 and 5-2 Compressors”. In: Asilomar Conference on Signals, Circuits and Systems. IEEE, 2001, 129–133 vol.1

    Google Scholar 

  35. Thomas B. Preußer. “Generic and Universal Parallel Matrix Summation with a Flexible Compression Goal for Xilinx FPGAs”. In: International Conference on Field-Programmable Logic and Applications (FPL). IEEE, 2017, pp. 1–7

    Google Scholar 

  36. Antonio G.M. Strollo, Davide De Caro, and Nicola Petra. “Elementary Functions Hardware Implementation Using Constrained Piecewise-Polynomial Approximations”. In: IEEE Transactions on Computers 60.3 (2011), pp. 418–432

    Google Scholar 

  37. W. J. Stenzel, W. J. Kubitz, and G. H. Garcia. “A Compact High- Speed Parallel Multiplication Scheme”. In: IEEE Transactions on Computers C-26.10 (1977), pp. 948–957

    Google Scholar 

  38. Paul F. Stelling and Vojin G. Oklobdzija. “Design Strategies for Optimal Hybrid Final Adders in a Parallel Multiplier”. In: Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology 14.3 (1996), pp. 321–331

    Google Scholar 

  39. Michael J. Schulte and Earl E. Swartzlander. “Truncated Multiplication with Correction Constant”. In: IEEE Workshop on VLSI Signal Processing. 1993, pp. 388–396

    Google Scholar 

  40. Earl E. Swartzlander. “A Review of Large Parallel Counter Designs”. In: Annual Symposium on VLSI. IEEE, 2004, pp. 89–98

    Google Scholar 

  41. Earl E. Swartzlander. “Parallel Counters”. In: IEEE Transactions on Computers C-22.11 (1973), pp. 1021–1024

    Google Scholar 

  42. Earl E. Swartzlander. “Merged Arithmetic”. In: IEEE Transactions on Computers C-29.10 (1980), pp. 946–950

    Google Scholar 

  43. Ajay K. Verma, Philip Brisk, and Paolo Ienne. “Data-Flow Transformations to Maximize the Use of Carry-Save Representation in Arithmetic Circuits”. In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 27.10 (2008), pp. 1761–1774

    Google Scholar 

  44. Sreehari Veeramachaneni, Avinash Lingamneni, M. Kirthi Krishna, and M. B. Srinivas. “Novel Architectures for Efficient (m, n) Parallel Counters”. In: 17th ACM Great Lakes symposium on VLSI. 2007

    Google Scholar 

  45. Anastasia Volkova, Matei Iştoan, Florent de Dinechin, and Thibault Hilaire. “Towards Hardware IIR Filters Computing Just Right: Direct Form I Case Study”. In: IEEE Transactions on Computers 68.4 (2019)

    Google Scholar 

  46. A. Weinberger. “4: 2 Carry-Save Adder Module”. In: IBM Technical Disclosure Bulletin 23.8 (1981), pp. 3811–3814

    Google Scholar 

  47. Yuelai Yuan, Le Tu, Kan Huang, Xiaoqiang Zhang, Tiejun Zhang, Dihu Chen, and Zixin Wang. “Area Optimized Synthesis of Compressor Trees on Xilinx FPGAs Using Generalized Parallel Counters”. In: IEEE Access 7 (2019), pp. 134815–134827

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

Copyright information

© 2024 Springer Nature Switzerland AG

About this chapter

Check for updates. Verify currency and authenticity via CrossMark

Cite this chapter

de Dinechin, F., Kumm, M. (2024). Sums of Weighted Bits. In: Application-Specific Arithmetic. Springer, Cham. https://doi.org/10.1007/978-3-031-42808-1_7

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-42808-1_7

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-42807-4

  • Online ISBN: 978-3-031-42808-1

  • eBook Packages: EngineeringEngineering (R0)

Publish with us

Policies and ethics