Google Scholar

Principal kernel analysis: A tractable methodology to simulate scaled GPU workloads

C Avalos Baddouh, M Khairy, RN Green… - MICRO-54: 54th Annual …, 2021 - dl.acm.org

Simulating all threads in a scaled GPU workload results in prohibitive simulation cost. Cycle-level
simulation is orders of magnitude slower than native silicon, the only solution is to …

Cited by 30

Forecasting GPU Performance for Deep Learning Training and Inference

S Lee, A Phanishayee, D Mahajan - Proceedings of the 30th ACM …, 2025 - dl.acm.org

Deep learning kernels exhibit a high level of predictable memory accesses and compute
patterns, making GPU's architecture well-suited for their execution. Moreover, software and …

Treelet prefetching for ray tracing

YH Chou, T Nowicki, TM Aamodt - Proceedings of the 56th Annual IEEE …, 2023 - dl.acm.org

Ray tracing is traditionally only used in offline rendering to produce images of high fidelity
because it is computationally expensive. Recent Graphics Processing Units (GPUs) have …

Cited by 4

CRISP: Concurrent Rendering and Compute Simulation Platform for GPUs

J Pan, TG Rogers - 2024 IEEE International Symposium on …, 2024 - ieeexplore.ieee.org

… We would like to thank Cesar Avalos for his help in the project. We would also like to
thank Shichen Qiao and Matthew D. Sinclair for their work on per-stream stat in GPGPU-Sim. …

[PDF] Principal Kernel Analysis: A Tractable Methodology to Simulate Scaled GPU Workloads

M Payer, TG Rogers - 2021 - academia.edu

Simulating all threads in a scaled GPU workload results in prohibitive simulation cost. Cycle-level
simulation is orders of magnitude slower than native silicon, the only solution is to …

[PDF] Accelerating the Evaluation of Large Workloads on Post-Dennard Systems using Sampling

A Sabu - alenks.github.io

With the end of Moore’s law, computer architects have turned to alternative approaches to
enhance computational capabilities. One prominent strategy involves a shift towards …

Data-driven Forecasting of Deep Learning Performance on GPUs

S Lee, A Phanishayee, D Mahajan - arXiv preprint arXiv:2407.13853, 2024 - arxiv.org

Deep learning kernels exhibit predictable memory accesses and compute patterns, making
GPUs' parallel architecture well-suited for their execution. Software and runtime systems for …

Photon: A fine-grained sampled simulation methodology for GPU workloads

C Liu, Y Sun, TE Carlson - Proceedings of the 56th Annual IEEE/ACM …, 2023 - dl.acm.org

GPUs, due to their massively-parallel computing architectures, provide high performance for
data-parallel applications. However, existing GPU simulators are too slow to enable …

Cited by 6

Path Forward Beyond Simulators: Fast and Accurate GPU Execution Time Prediction for DNN Workloads

Y Li, Y Sun, A Jog - Proceedings of the 56th Annual IEEE/ACM …, 2023 - dl.acm.org

Today, DNNs’ high computational complexity and sub-optimal device utilization present a
major roadblock to democratizing DNNs. To reduce the execution time and improve device …

Cited by 12

Development Of A Heterogeneous Architecture Simulation Framework

S Mohapatra - 2022 - etda.libraries.psu.edu

Heterogenous systems consisting of processors of varying nature which complement each
other’s deficiencies are rapidly eclipsing the homogeneous systems of past. The consumer …
