An approximate method for filtering out data dependencies with a sufficiently large distance between memory references

Patricio Bulić¹ &
Tomaž Dobravec¹

63 Accesses
1 Citation
Explore all metrics

Abstract

In this paper we present an approximate algorithm for detecting and filtering data dependencies with a sufficiently large distance between memory references. A sequence of the same operations (typically enclosed in a ‘for’ loop) can be replaced with a single SIMD operation if the distance between memory references is greater than or equal to the number of data processed in the SIMD register. Some loops that could not be vectorized on traditional vector processors, can still be parallelized for short SIMD execution. There are a number of approximate data-dependence tests that have been proposed in the literature but in all of them data dependency will be assumed when actually there is no such a dependence that could restrict parallelization related to the short SIMD execution model. By examining the properties of linear subscript expressions of possibly conflicting data references, our algorithm gives the green light to the parallelization process if some sufficient conditions regarding the dependence distance are met. Our method is based on the Banerjee test and checks the minimum and maximum distances between memory references within the iteration space rather than searching for the existence of an integer solution to the dependence equation. The proposed method extends the accuracy and applicability of the classical Banerjee test.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Approximate Data Dependence Profiling Based on Abstract Interval and Congruent Domains

Detecting SIMDization Opportunities through Static/Dynamic Dependence Analysis

An Analytical Evaluation of Data Dependence Analysis Techniques

Article 18 May 2018

References

Aliaga JI et al (2009) Toward the parallelization of GSL. J Supercomput 48:88–114. doi:10.1007/s11227-008-0207-z
Article Google Scholar
Allen R, Kennedy K (2001) Optimizing compilers for modern architectures: a dependence-based approach. Morgan Kaufmann, San Mateo
Google Scholar
Arabnia HR, Oliver MA (1987) A transputer network for the arbitrary rotation of digitized images. Comput J 30(5):425–432
Google Scholar
Banerjee U, Eigenman R, Nicolau A, Padua DA (1993) Automatic program parallelization. Proc IEEE 81(2):211–243
Article Google Scholar
Banerjee U (1997) Dependence analysis: a book series on loop transformations for restructuring compilers. Kluwer Academic, Dordrecht
MATH Google Scholar
Bhandarkar SM, Arabnia HR (1995) The Hough transform on a reconfigurable multi-ring network. J Parallel Distrib Comput 24(1):107–114
Article Google Scholar
Bhandarkar SM, Arabnia HR, Smith JW (1995) A reconfigurable architecture for image-processing and computer vision. Int J Pattern Recognit Artif Intell 9(2):201–229
Article Google Scholar
Bik AJC, Girkar M, Grey PM, Tian XM (2002) Automatic intra-register vectorization for the Intel (R) architecture. Int J Parallel Program 30(2):65–98
Article MATH Google Scholar
Blume W, Eigenman R, Hoeflinger J, Padua DA, Petersen P, Rauchwerger L, Tu P (1994) Automatic detection of parallelism: a great challenge for high-performance computing. IEEE Parallel Distrib Process Technol 2(3):37–47
Article Google Scholar
Blume W, Eigenman R (1998) Nonlinear and symbolic data dependence testing. IEEE Parallel Distrib Process Technol 9(12):1180–1194
Article Google Scholar
Chuang YJ, Wu JL (2005) An efficient matrix-based 2-D DCT splitter and merger for SIMD instructions. IEICE Trans Inf Syst E88D(7):1569–1577. doi:0.1093/ietisy/e88-d.7.1569
Article Google Scholar
Di Blas A, Jagota A, Hughey R (2005) Optimizing neural networks on SIMD parallel computers. Parallel Comput 31(1):97–115. doi:10.1016/j.parco.2004.11.002
Article Google Scholar
Guo M et al (2009) Communication-free data alignment for arrays with exponential references in parallelizing compilers for scalable parallel systems. J Supercomput. doi:10.1007/s11227-009-0280-y
Google Scholar
Hassaballah M et al (2008) A review of SIMD multimedia extensions and their usage in scientific and engineering applications. Comput J 51(6):630–649. doi:10.1093/comjnl/bxm099
Article Google Scholar
Huang TC, Yang CM (2000) Data dependence analysis for array references. J Syst Softw 52:55–65
Article Google Scholar
Kim S et al (2009) Region-based parallelization of irregular reductions on explicitly managed memory hierarchies. J Supercomput. doi:10.1007/s11227-009-0340-3
Google Scholar
Krall A, Lelait S (2000) Compilation techniques for multimedia processors. Int J Parallel Program 28(4):347–361
Article Google Scholar
Kurzak J et al (2009) Optimizing matrix multiplication for a short-vector SIMD architecture—CELL processor. Parallel Comput 35(3):138–150. doi:10.1016/j.parco.2008.12.010
Article MathSciNet Google Scholar
Lee WC et al (2008) Algorithm and software optimization of variable block size motion estimation for H.264/AVC on a VLIW-SIMD DSP. J Signal Process Syst Signal Image Video Technol 51(3):289–302. doi:10.1007/s11265-007-0151-9
Article Google Scholar
Lotfi S, Parsa S (2009) Parallel loop generation and scheduling. J Supercomput doi:10.1007/s11227-008-0262-5
Google Scholar
Paver NC, Khan MH, Aldrich BC (2006) Optimizing mobile multimedia using SIMD techniques. Multimed Tools Appl 28(2):393–416
Article Google Scholar
Petersen PM, Padua DA (1996) Static and dynamic evaluation of data dependence analysis techniques. IEEE Trans Parallel Distrib Syst 7(11):1121–1132
Article Google Scholar
Pough W (1992) A practical algorithm for exact array dependence analysis. Commun ACM 35(8):102–114
Article Google Scholar
Pough W (1992) Definitions of dependence distance. ACM Lett Program Lang Syst 1(3):261–265
Article Google Scholar
Psarris K (2002) Program analysis techniques for transforming programs for parallel execution. Parallel Comput 28(3):455–469
Article MATH Google Scholar
Psarris K, Klappholz D, Kong X (2004) An experimental evaluation of data dependence analysis techniques. IEEE Trans Parallel Distrib Syst 15(3):196–213
Article Google Scholar
Shahbahrami A et al (2008) Implementing the 2-D wavelet transform on SIMD-enhanced general-purpose processors. IEEE Trans Multimed 10(1):43–51. doi:10.1109/TMM.2007.911195
Article Google Scholar
Sreraman N, Govindarajan R (2000) A vectorizing compiler for multimedia extensions. Int J Parallel Program 28(4):363–400
Article Google Scholar
Wani MA, Arabnia HR (2003) Parallel edge-region-based segmentation algorithm targeted at reconfigurable MultiRing network. J Supercomput 25(1):43–62
Article MATH Google Scholar
Wolfe MJ, Banerjee U (1987) Data dependence and its application to parallel processing. Int J Parallel Program 16(2):137–178
Article MathSciNet MATH Google Scholar
Wolfe MJ, Tseng CW (1992) The power test for data dependence. IEEE Trans Parallel Distrib Syst 3(5):591–601
Article Google Scholar
Xue J, Guo M, Wei D (2008) Improving the parallelism of iterative methods by aggressive loop fusion. J Supercomput 43:147–164. doi:10.1007/s11227-007-0124-6
Article Google Scholar
Yang CT, Lai KC (2009) A directive-based MPI code generator for Linux PC clusters. J Supercomput. doi:10.1007/s11227-008-0258-1
Google Scholar
Yang X et al (2009) Matrix-based streamization approach for improving locality and parallelism on FT64 stream processor. J Supercomput 47:171–197. doi:10.1007/s11227-008-0186-0
Article Google Scholar
Yuan FN et al (2006) An interactive 3D visualization system based on PC using Intel SIMD, 3D texturing and thinning techniques. Int J Pattern Recognit Artif Intell 20(3):393–416
Article Google Scholar
Zhou J, Zeng G (2008) A general data dependence analysis for parallelizing compilers. J Supercomput 45:236–252. doi:10.1007/s11227-007-0168-7
Article Google Scholar
Zhou J, Zeng G (2008) A general data dependence analysis for parallelizing compilers. ERRATUM. J Supercomput 45:253. doi:10.1007/s11227-008-0197-x
Article Google Scholar
Zhou JP, Shi C (2008) Efficient SIMD optimization for media processors. J Zhejiang Univ-Sci A 9(4):524–530. doi:10.1631/jzus.A071203
Article MATH Google Scholar
Zima HP, Chapman BM (1990) Supercompilers for parallel and vector computers. Addison-Wesley, Reading
Google Scholar

Download references

Author information

Authors and Affiliations

Faculty of Computer and Information Science, University of Ljubljana, Tržaška c. 25, 1000, Ljubljana, Slovenia
Patricio Bulić & Tomaž Dobravec

Authors

Patricio Bulić
View author publications
You can also search for this author in PubMed Google Scholar
Tomaž Dobravec
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Patricio Bulić.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Bulić, P., Dobravec, T. An approximate method for filtering out data dependencies with a sufficiently large distance between memory references. J Supercomput 56, 226–244 (2011). https://doi.org/10.1007/s11227-009-0364-8

Download citation

Published: 27 November 2009
Issue Date: May 2011
DOI: https://doi.org/10.1007/s11227-009-0364-8

An approximate method for filtering out data dependencies with a sufficiently large distance between memory references

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Approximate Data Dependence Profiling Based on Abstract Interval and Congruent Domains

Detecting SIMDization Opportunities through Static/Dynamic Dependence Analysis

An Analytical Evaluation of Data Dependence Analysis Techniques

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Subscribe and save

Buy Now

Navigation

An approximate method for filtering out data dependencies with a sufficiently large distance between memory references

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Approximate Data Dependence Profiling Based on Abstract Interval and Congruent Domains

Detecting SIMDization Opportunities through Static/Dynamic Dependence Analysis

An Analytical Evaluation of Data Dependence Analysis Techniques

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Subscribe and save

Buy Now

Search

Navigation