Fast 3D wavelet transform on multicore and many-core computing platforms

The Journal of Supercomputing

Abstract

The three-dimensional discrete wavelet transform (3D-DWT) has attracted the attention of the research community, particularly in areas such as video watermarking, compression of volumetric medical data, multispectral image coding, 3D model coding and video coding. In this work, we present several strategies to speed up the 3D-DWT computation through multicore processing, together with an in-depth analysis of the available compiler optimizations. Depending on both the multicore platform and the group of pictures (GOP) size, the developed parallel algorithm obtains efficiencies above 95% using up to four cores (or processes), and above 83% using up to 12 cores. Furthermore, the extra memory requirement is under 0.12% for low-resolution video frames and under 0.017% for high-resolution video frames. We also present a CUDA-based algorithm that computes the 3D-DWT using shared memory for the extra memory demands, obtaining speed-ups of up to 12.68 on the many-core GTX280 platform. In areas such as video processing or ultra-high-definition image processing, memory requirements can significantly degrade the performance of an algorithm. Our algorithm increases the memory requirements by a negligible percentage and performs a nearly in-place computation of the 3D-DWT, whereas other state-of-the-art 3D-DWT algorithms commonly store the computed wavelet coefficients in a separate memory space, thereby doubling the memory requirements.
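To make the idea of a nearly in-place 3D-DWT concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single decomposition level, an averaging Haar filter (the abstract does not name the filter used), even dimensions, and a volume stored as a flat list in frame-major order. The function names `haar_1d_inplace` and `dwt3d_one_level` are hypothetical. The only extra storage is one scratch line whose length equals the largest dimension.

```python
def haar_1d_inplace(data, start, stride, n, buf):
    """One Haar analysis level over the n samples data[start + k*stride].

    buf (length >= n) is the only scratch space; results are written
    back over the input samples, low-pass half first.
    """
    half = n // 2
    for k in range(half):
        a = data[start + (2 * k) * stride]
        b = data[start + (2 * k + 1) * stride]
        buf[k] = (a + b) / 2.0         # low-pass: average (assumed normalization)
        buf[half + k] = (a - b) / 2.0  # high-pass: half difference
    for k in range(n):
        data[start + k * stride] = buf[k]


def dwt3d_one_level(vol, frames, rows, cols):
    """Separable one-level Haar transform of a flat frames*rows*cols volume.

    Returns the number of scratch elements used, to compare against len(vol).
    Assumes frames, rows and cols are all even.
    """
    buf = [0.0] * max(frames, rows, cols)
    # Pass 1: along each row (stride 1).
    for f in range(frames):
        for r in range(rows):
            haar_1d_inplace(vol, (f * rows + r) * cols, 1, cols, buf)
    # Pass 2: along each column within a frame (stride cols).
    for f in range(frames):
        for c in range(cols):
            haar_1d_inplace(vol, f * rows * cols + c, cols, rows, buf)
    # Pass 3: along the temporal axis (stride rows*cols).
    for r in range(rows):
        for c in range(cols):
            haar_1d_inplace(vol, r * cols + c, rows * cols, frames, buf)
    return len(buf)


if __name__ == "__main__":
    frames, rows, cols = 4, 4, 8
    vol = [float(i) for i in range(frames * rows * cols)]
    extra = dwt3d_one_level(vol, frames, rows, cols)
    print("scratch elements:", extra, "of", len(vol))
```

In this sketch the lines within each pass are independent of one another, so they could be distributed among threads, with each thread holding its own scratch line. Whether this matches the decomposition used in the paper is not stated in the abstract.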

Acknowledgements

This research was partially supported by the Spanish Ministry of Education and Science under grant DPI2007-66796-C03-03 and the Spanish Ministry of Science and Innovation under grant number TIN2008-06570-C04-04.

Author information

Corresponding author

Correspondence to H. Migallón.

About this article

Cite this article

Galiano, V., López-Granado, O., Malumbres, M.P. et al. Fast 3D wavelet transform on multicore and many-core computing platforms. J Supercomput 65, 848–865 (2013). https://doi.org/10.1007/s11227-013-0868-0

  • DOI: https://doi.org/10.1007/s11227-013-0868-0
