Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1542431.1542446acmconferencesArticle/Chapter ViewAbstractPublication PagesismmConference Proceedingsconference-collections
research-article

A component model of spatial locality

Published: 19 June 2009 Publication History

Abstract

Good spatial locality alleviates both the latency and bandwidth problem of memory by boosting the effect of prefetching and improving the utilization of cache. However, conventional definitions of spatial locality are inadequate for a programmer to precisely quantify the quality of a program, to identify causes of poor locality, and to estimate the potential by which spatial locality can be improved.
This paper describes a new, component-based model for spatial locality. It is based on measuring the change of reuse distances as a function of the data-block size. It divides spatial locality into components at program and behavior levels. While the base model is costly because it requires the tracking of the locality of every memory access, the overhead can be reduced by using small inputs and by extending a sampling-based tool. The paper presents the result of the analysis for a large set of benchmarks, the cost of the analysis, and the experience of a user study, in which the analysis helped to locate a data-layout problem and improve performance by 7% with a 6-line change in an application with over 2,000 lines.

References

[1]
G. Ammons, T. Ball, and J. R. Larus. Exploiting hardware performance counters with flow and context sensitive profiling. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 85--96, 1997.
[2]
M. Arnold and B. G. Ryder. A framework for reducing the cost of instrumented code. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, Snowbird, Utah, June 2001.
[3]
E. Berg and E. Hagersten. Fast data-locality profiling of native execution. In Proceedings of the International Conference on Measurement and Modeling of Computer Systems, pages 169--180, 2005.
[4]
K. Beyls and E. D'Hollander. Discovery of locality-improving refactoring by reuse path analysis. In Proceedings of HPCC. Springer. Lecture Notes in Computer Science Vol. 4208, pages 220--229, 2006.
[5]
R. B. Bunt and J. M. Murphy. Measurement of locality and the behaviour of programs. The Computer Journal, 27(3):238--245, 1984.
[6]
B. Calder, C. Krintz, S. John, and T. Austin. Cache-conscious data placement. In Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VIII), San Jose, Oct 1998.
[7]
C. Cascaval, E. Duesterwald, P. F. Sweeney, and R. W. Wisniewski. Multiple page size modeling and optimization. In Proceedings of the International Conference on Parallel Architecture and Compilation Techniques, St. Louis, MO, 2005.
[8]
T. M. Chilimbi. Efficient representations and abstractions for quantifying and exploiting data reference locality. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, Snowbird, Utah, June 2001.
[9]
C. Ding and Y. Zhong. Predicting whole-program locality with reuse distance analysis. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, San Diego, CA, June 2003.
[10]
C. Fang, S. Carr, S. Onder, and Z. Wang. Instruction based memory distance analysis and its application to optimization. In Proceedings of the International Conference on Parallel Architecture and Compilation Techniques, St. Louis, MO, 2005.
[11]
H. Han and C.-W. Tseng. Exploiting locality for irregular scientific codes. IEEE Transactions on Parallel and Distributed Systems, 17(7):606--618, 2006.
[12]
M. Hirzel and T. M. Chilimbi. Bursty tracing: A framework for low-overhead temporal profiling. In Proceedings of ACM Workshop on Feedback-Directed and Dynamic Optimization, Dallas, Texas, 2001.
[13]
K. Kelsey, T. Bai, and C. Ding. Fast track: a software system for speculative optimization. In Proceedings of the International Symposium on Code Generation and Optimization, 2009.
[14]
K.-H. Li. Reservoir-sampling algorithms of time complexity o(n(1+log(n/n))). ACM Transactions on Mathematical Software, 20(4):481--493, December 1994.
[15]
G. Marin and J. Mellor-Crummey. Cross architecture performance predictions for scientific applications using parameterized models. In Proceedings of the International Conference on Measurement and Modeling of Computer Systems, New York City, NY, June 2004.
[16]
G. Marin and J. Mellor-Crummey. Scalable cross-architecture predictions of memory hierarchy response for scientific applications. In Proceedings of the Symposium of the Las Alamos Computer Science Institute, Sante Fe, New Mexico, 2005.
[17]
R. L. Mattson, J. Gecsei, D. Slutz, and I. L. Traiger. Evaluation techniques for storage hierarchies. IBM System Journal, 9(2):78--117, 1970.
[18]
K. S. McKinley, S. Carr, and C.-W. Tseng. Improving data locality with loop transformations. ACM Transactions on Programming Languages and Systems, 18(4):424--453, July 1996.
[19]
R. C. Murphy and P. M. Kogge. On the memory access patterns of supercomputer applications: Benchmark selection and its implications. IEEE Transactions on Computers, 56(7):937--945, 2007.
[20]
N. Nethercote and J. Seward. Valgrind: a framework for heavyweight dynamic binary instrumentation. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 89--100, 2007.
[21]
E. Petrank and D. Rawitz. The hardness of cache conscious data placement. In Proceedings of ACM Symposium on Principles of Programming Languages, Portland, Oregon, January 2002.
[22]
M. L. Seidl and B. G. Zorn. Segregating heap objects by reference behavior and lifetime. In Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VIII), San Jose, Oct 1998.
[23]
X. Shen, Y. Gao, C. Ding, and R. Archambault. Lightweight reference affinity analysis. In Proceedings of the 19th ACM International Conference on Supercomputing, pages 131--140, Cambridge, MA, June 2005.
[24]
X. Shen and J. Shaw. Scalable implementation of efficient locality approximation. In J. N. Amaral, editor, Proceedings of the Workshop on Languages and Compilers for Parallel Computing, pages 202--216, 2008.
[25]
X. Shen, J. Shaw, B. Meeker, and C. Ding. Locality approximation using time. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 55--61, 2007.
[26]
X. Shen, Y. Zhong, and C. Ding. Regression-based multi-model prediction of data reuse signature. In Proceedings of the 4th Annual Symposium of the Las Alamos Computer Science Institute, Sante Fe, New Mexico, November 2003.
[27]
A. J. Smith. On the effectiveness of set associative page mapping and its applications in main memory management. In Proceedings of the 2nd International Conference on Software Engineering, 1976.
[28]
Spec cpu benchmarks. http://www.spec.org/benchmarks.html\#cpu.
[29]
M. M. Strout, L. Carter, and J. Ferrante. Compile-time composition of run-time data and iteration reorderings. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 245--257, San Diego, CA, June 2003.
[30]
J. Weinberg, M. O. McCracken, E. Strohmaier, and A. Snavely. Quantifying locality in the memory access patterns of hpc applications. In Proceedings of Supercomputing, 2005.
[31]
B. S. White, S. A. McKee, B. R. de Supinski, B. Miller, D. Quinlan, and M. Schulz. Improving the computational intensity of unstructured mesh applications. In Proceedings of the 19th ACM International Conference on Supercomputing, pages 341--350, Cambridge, MA, June 2005.
[32]
C. Zhang, C. Ding, M. Ogihara, Y. Zhong, and Y. Wu. A hierarchical model of data locality. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Charleston, SC, January 2006.
[33]
H. Zhang and D. Gildea. Stochastic lexicalized inversion transduction grammar for alignment. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pages 475--482, 2005.
[34]
Y. Zhong, S. G. Dropsho, X. Shen, A. Studer, and C. Ding. Miss rate prediction across program inputs and cache configurations. IEEE Transactions on Computers, 56(3):328--343, March 2007.
[35]
Y. Zhong, M. Orlovich, X. Shen, and C. Ding. Array regrouping and structure splitting using whole-program reference affinity. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2004.

Cited By

View all
  • (2021)DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement BottlenecksIEEE Access10.1109/ACCESS.2021.31109939(134457-134502)Online publication date: 2021
  • (2021)A Study on Modeling and Optimization of Memory SystemsJournal of Computer Science and Technology10.1007/s11390-021-0771-836:1(71-89)Online publication date: 30-Jan-2021
  • (2019)A Relational Theory of LocalityACM Transactions on Architecture and Code Optimization10.1145/334110916:3(1-26)Online publication date: 20-Aug-2019
  • Show More Cited By

Index Terms

  1. A component model of spatial locality

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image ACM Conferences
      ISMM '09: Proceedings of the 2009 international symposium on Memory management
      June 2009
      158 pages
      ISBN:9781605583471
      DOI:10.1145/1542431
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Sponsors

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 19 June 2009

      Permissions

      Request permissions for this article.

      Check for updates

      Author Tags

      1. reuse distance
      2. spatial locality

      Qualifiers

      • Research-article

      Conference

      ISMM '09
      Sponsor:

      Acceptance Rates

      ISMM '09 Paper Acceptance Rate 15 of 32 submissions, 47%;
      Overall Acceptance Rate 72 of 156 submissions, 46%

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)10
      • Downloads (Last 6 weeks)6
      Reflects downloads up to 13 Nov 2024

      Other Metrics

      Citations

      Cited By

      View all
      • (2021)DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement BottlenecksIEEE Access10.1109/ACCESS.2021.31109939(134457-134502)Online publication date: 2021
      • (2021)A Study on Modeling and Optimization of Memory SystemsJournal of Computer Science and Technology10.1007/s11390-021-0771-836:1(71-89)Online publication date: 30-Jan-2021
      • (2019)A Relational Theory of LocalityACM Transactions on Architecture and Code Optimization10.1145/334110916:3(1-26)Online publication date: 20-Aug-2019
      • (2019)Memory and Parallelism Analysis Using a Platform-Independent ApproachProceedings of the 22nd International Workshop on Software and Compilers for Embedded Systems10.1145/3323439.3323988(23-26)Online publication date: 27-May-2019
      • (2019)Multi-spectral Reuse Distance: Divining Spatial Information from Temporal Data2019 IEEE High Performance Extreme Computing Conference (HPEC)10.1109/HPEC.2019.8916398(1-8)Online publication date: Sep-2019
      • (2019)Platform Independent Software Analysis for Near Memory Computing2019 22nd Euromicro Conference on Digital System Design (DSD)10.1109/DSD.2019.00093(606-609)Online publication date: Aug-2019
      • (2019)Near-memory computingMicroprocessors & Microsystems10.1016/j.micpro.2019.10286871:COnline publication date: 1-Nov-2019
      • (2018)CaL: Extending Data Locality to Consider Concurrency for Performance OptimizationIEEE Transactions on Big Data10.1109/TBDATA.2017.27538254:2(273-288)Online publication date: 1-Jun-2018
      • (2015)Why Does Data Prefetching Not Work for Modern Workloads?The Computer Journal10.1093/comjnl/bxv11259:2(244-259)Online publication date: 23-Dec-2015
      • (2013)Microarchitectural design space exploration made fastMicroprocessors & Microsystems10.1016/j.micpro.2012.07.00637:1(41-51)Online publication date: 1-Feb-2013
      • Show More Cited By

      View Options

      Get Access

      Login options

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media