A data locality optimizing algorithm

Published: 01 May 1991

References

[1]
W. Abu-Sufah. Improving the Performance of Virtual Memory Computers. PhD thesis, University of Illinois at Urbana-Champaign, Nov 1978.
[2]
U. Banerjee. Data dependence in ordinary programs. Technical Report 76-837, University of Illinois at Urbana-Champaign, Nov 1976.
[3]
U. Banerjee. Dependence Analysis for Supercomputing. Kluwer Academic, 1988.
[4]
U. Banerjee. Unimodular transformations of double loops. In 3rd Workshop on Languages and Compilers for Parallel Computing, Aug 1990.
[5]
D. Callahan, S. Carr, and K. Kennedy. Improving register allocation for subscripted variables. In Proceedings of the ACM SIGPLAN '90 Conference on Programming Language Design and Implementation, June 1990.
[6]
J. Dongarra, J. Du Croz, S. Hammarling, and I. Duff. A set of level 3 basic linear algebra subprograms. ACM Transactions on Mathematical Software, pages 1-17, March 1990.
[7]
K. Gallivan, W. Jalby, U. Meier, and A. Sameh. The impact of hierarchical memory systems on linear algebra algorithm design. Technical report, University of Illinois, 1987.
[8]
D. Gannon, W. Jalby, and K. Gallivan. Strategies for cache and local memory management by global program transformation. Journal of Parallel and Distributed Computing, 5:587-616, 1988.
[9]
G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, 1989.
[10]
F. Irigoin and R. Triolet. Computing dependence direction vectors and dependence cones. Technical Report E94, Centre D'Automatique et Informatique, 1988.
[11]
F. Irigoin and R. Triolet. Supernode partitioning. In Proc. 15th Annual ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, January 1988.
[12]
M. S. Lam, E. E. Rothberg, and M. E. Wolf. The cache performance and optimizations of blocked algorithms. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, April 1991.
[13]
A. C. McKellar and E. G. Coffman. The organization of matrices and matrix operations in a paged multiprogramming environment. CACM, 12(3):153-165, 1969.
[14]
A. Porterfield. Software Methods for Improvement of Cache Performance on Supercomputer Applications. PhD thesis, Rice University, May 1989.
[15]
R. Schreiber and J. Dongarra. Automatic blocking of nested loops. 1990.
[16]
M. E. Wolf and M. S. Lam. A loop transformation theory and an algorithm to maximize parallelism. IEEE Transactions on Parallel and Distributed Systems, July 1991.
[17]
M. J. Wolfe. Techniques for improving the inherent parallelism in programs. Technical Report UIUCDCS-R-78-929, University of Illinois, 1978.
[18]
M. J. Wolfe. More iteration space tiling. In Supercomputing '89, Nov 1989.

Published In

ACM SIGPLAN Notices  Volume 26, Issue 6
June 1991
352 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/113446
  • PLDI '91: Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
    May 1991
    356 pages
    ISBN:0897914287
    DOI:10.1145/113445
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 May 1991
Published in SIGPLAN Volume 26, Issue 6

Qualifiers

  • Article

Article Metrics

  • Downloads (Last 12 months): 469
  • Downloads (Last 6 weeks): 69
Reflects downloads up to 22 Nov 2024

Cited By

  • (2024) Lightweight Deep Learning for Resource-Constrained Environments: A Survey. ACM Computing Surveys. DOI: 10.1145/3657282. Online publication date: 11-May-2024
  • (2023) Increasing FPGA Accelerators Memory Bandwidth With a Burst-Friendly Memory Layout. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. DOI: 10.1109/TCAD.2022.3201494. 42:5 (1546-1559). Online publication date: 1-May-2023
  • (2023) Loop interchange and tiling for multi-dimensional loops to minimize write operations on NVMs. Journal of Systems Architecture. DOI: 10.1016/j.sysarc.2022.102799. 135 (102799). Online publication date: Feb-2023
  • (2023) Combination of parallelization and skewed tiling. Procedia Computer Science. DOI: 10.1016/j.procs.2023.12.024. 229 (228-235). Online publication date: 2023
  • (2022) More is Less – Byte-quantized models are faster than bit-quantized models on the edge. 2022 IEEE International Conference on Big Data (Big Data). DOI: 10.1109/BigData55660.2022.10020437. (5632-5638). Online publication date: 17-Dec-2022
  • (2021) Intra- and Inter-Layer Transformation to Reduce Memory Traffic for CNN Computation. 50th International Conference on Parallel Processing Workshop. DOI: 10.1145/3458744.3473353. (1-5). Online publication date: 9-Aug-2021
  • (2021) HCMSL: Hybrid Cross-modal Similarity Learning for Cross-modal Retrieval. ACM Transactions on Multimedia Computing, Communications, and Applications. DOI: 10.1145/3412847. 17:1s (1-22). Online publication date: 26-Apr-2021
  • (2021) Dynamic Graph Learning Convolutional Networks for Semi-supervised Classification. ACM Transactions on Multimedia Computing, Communications, and Applications. DOI: 10.1145/3412846. 17:1s (1-13). Online publication date: 31-Mar-2021
  • (2021) TENET. Proceedings of the 48th Annual International Symposium on Computer Architecture. DOI: 10.1109/ISCA52012.2021.00062. (720-733). Online publication date: 14-Jun-2021
  • (2021) Temporal blocking of finite-difference stencil operators with sparse “off-the-grid” sources. 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS). DOI: 10.1109/IPDPS49936.2021.00058. (497-506). Online publication date: May-2021
