Chilimbi et al., 1998 - Google Patents
Improving pointer-based codes through cache-conscious data placementChilimbi et al., 1998
View PDF- Document ID
- 10990173799991878551
- Author
- Chilimbi T
- Larus J
- Hill M
- Publication year
External Links
Snippet
Processor and memory technology trends show a continual increase in the cost of accessing main memory. Machine designers have tried to mitigate the eflect of this trend through hardware and software prefetching, multiple levels of cache, non~ blocking caches, dynamic …
- 230000015654 memory 0 abstract description 64
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/443—Optimisation
- G06F8/4441—Reducing the execution time required by the program code
- G06F8/4442—Reducing the number of cache misses; Data prefetching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
- G06F8/456—Parallelism detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3457—Performance evaluation by simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0844—Multiple simultaneous or quasi-simultaneous cache accessing
- G06F12/0846—Cache with multiple tag or data arrays being simultaneously accessible
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
- G06F12/023—Free address space management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/885—Monitoring specific for caches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Chilimbi et al. | Cache-conscious structure layout | |
Ding et al. | Improving effective bandwidth through compiler enhancement of global cache reuse | |
Ghosh et al. | Cache miss equations: a compiler framework for analyzing and tuning memory behavior | |
Ghosh et al. | Precise miss analysis for program transformations with caches of arbitrary associativity | |
Mitchell et al. | Quantifying the multi-level nature of tiling interactions | |
Gloy et al. | Procedure placement using temporal-ordering information | |
Roth et al. | Effective jump-pointer prefetching for linked data structures | |
Im | Optimizing the performance of sparse matrix-vector multiplication | |
Carlisle | Olden: parallelizing programs with dynamic data structures on distributed-memory machines | |
Panda et al. | Local memory exploration and optimization in embedded systems | |
Li et al. | Compiler-assisted data distribution for chip multiprocessors | |
Chilimbi et al. | Improving pointer-based codes through cache-conscious data placement | |
Rao et al. | Intersectx: an efficient accelerator for graph mining | |
Cociorva et al. | Loop optimization for a class of memory-constrained computations | |
Weissman | Performance counters and state sharing annotations: a unified approach to thread locality | |
Kandemir et al. | Compilation techniques for out-of-core parallel computations | |
Boyd et al. | A hierarchical approach to modeling and improving the performance of scientific applications on the ksr1 | |
Kessler | Analysis of multi-megabyte secondary CPU cache memories | |
Yan et al. | Cacheminer: A runtime approach to exploit cache locality on SMP | |
Chilimbi | Cache-conscious data structures: design and implementation | |
Pingali et al. | Restructuring computations for temporal data cache locality | |
Trotter et al. | Cache simulation for irregular memory traffic on multi-core CPUs: Case study on performance models for sparse matrix–vector multiplication | |
Kodukula et al. | Data-centric transformations for locality enhancement | |
Venkataraman et al. | A blocked all-pairs shortest-paths algorithm | |
Moreira et al. | A standard Java array package for technical computing |