Research article (public access), HPG conference proceedings
DOI: 10.1145/3105762.3105771

Dual streaming for hardware-accelerated ray tracing

Published: 28 July 2017

Abstract

Hardware acceleration for ray tracing has been a topic of great interest in computer graphics. However, even with proposed custom hardware, the inherent irregularity in the memory access pattern of ray tracing has limited its performance compared with rasterization on commercial GPUs. We provide a different approach to hardware-accelerated ray tracing, beginning with modifying the order of rendering operations, inspired by the streaming character of rasterization. Our dual streaming approach organizes the memory access of ray tracing into two predictable data streams. The predictability of these streams allows perfect prefetching and makes the memory access pattern an excellent match for the behavior of DRAM memory systems. By reformulating ray tracing as fully predictable streams of rays and of geometry, we alleviate many long-standing problems of high-performance ray tracing and expose new opportunities for future research. We therefore also include extensive discussions of potential avenues for future research aimed at improving the performance of hardware-accelerated ray tracing using dual streaming.
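
The idea of two predictable streams can be illustrated with a toy software sketch. This is not the architecture from the paper, only a minimal illustration under stated assumptions: the scene is a 1D hierarchy of segments stored so that every parent precedes its children, each segment owns a queue of ray indices, and a "ray" is a single x coordinate that hits a primitive span if it falls inside it. The names `Segment` and `dual_stream_trace`, and the whole data layout, are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy scene segment covering the interval [lo, hi) (hypothetical layout)."""
    lo: float
    hi: float
    children: list = field(default_factory=list)  # indices of later segments
    prims: list = field(default_factory=list)     # (x_lo, x_hi, depth) spans


def dual_stream_trace(segments, rays_x):
    """Single pass over the scene; returns (nearest depth per ray, segment read order).

    Assumes every child index is larger than its parent's index, so one
    front-to-back pass over `segments` visits parents before children.
    """
    queues = [deque() for _ in segments]
    queues[0].extend(range(len(rays_x)))      # all rays start at the root segment
    nearest = [float("inf")] * len(rays_x)
    read_order = []

    for seg_id, seg in enumerate(segments):   # scene stream: fixed sequential order
        if not queues[seg_id]:
            continue                          # no rays waiting, segment is skipped
        read_order.append(seg_id)
        while queues[seg_id]:                 # ray stream: drain the queue in order
            r = queues[seg_id].popleft()
            x = rays_x[r]
            for x_lo, x_hi, depth in seg.prims:
                if x_lo <= x < x_hi and depth < nearest[r]:
                    nearest[r] = depth        # keep the closest hit so far
            for c in seg.children:
                if segments[c].lo <= x < segments[c].hi:
                    queues[c].append(r)       # forward the ray to a later segment
    return nearest, read_order


segments = [
    Segment(0, 8, children=[1, 2]),
    Segment(0, 4, prims=[(0, 2, 5.0), (1, 4, 3.0)]),
    Segment(4, 8, prims=[(5, 6, 7.0)]),
]
nearest, read_order = dual_stream_trace(segments, [0.5, 1.5, 5.5, 7.0, 9.0])
print(nearest)     # [5.0, 3.0, 7.0, inf, inf]
print(read_order)  # [0, 1, 2]
```

In this sketch the order in which segments are read never depends on an individual ray's traversal path, and each queue is consumed front to back. That is the property that would make both streams easy to prefetch. A real design would also need to bound queue storage and handle rays that must revisit geometry, which this toy ignores.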


Published In

HPG '17: Proceedings of High Performance Graphics
July 2017
180 pages
ISBN: 9781450351010
DOI: 10.1145/3105762
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tag

  1. raytracing hardware

Qualifiers

  • Research-article

Conference

HPG '17: High-Performance Graphics
July 28 - 30, 2017
Los Angeles, California

Acceptance Rates

Overall Acceptance Rate 15 of 44 submissions, 34%

Article Metrics

  • Downloads (Last 12 months): 235
  • Downloads (Last 6 weeks): 20
Reflects downloads up to 20 Nov 2024

Cited By

  • (2024) Potamoi: Accelerating Neural Rendering via a Unified Streaming Architecture. ACM Transactions on Architecture and Code Optimization 21:4 (1-25). DOI: 10.1145/3689340. Online publication date: 20-Nov-2024
  • (2024) MPRTA: An Efficient Multilevel Parallel Mobile Accelerator for High-Performance Ray Tracing. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 32:2 (396-400). DOI: 10.1109/TVLSI.2023.3334711. Online publication date: Feb-2024
  • (2024) Cicero: Addressing Algorithmic and Architectural Bottlenecks in Neural Rendering by Radiance Warping and Memory Optimizations. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (1293-1308). DOI: 10.1109/ISCA59077.2024.00096. Online publication date: 29-Jun-2024
  • (2023) Treelet Prefetching For Ray Tracing. Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture (742-755). DOI: 10.1145/3613424.3614288. Online publication date: 28-Oct-2023
  • (2022) RT Engine: An Efficient Hardware Architecture for Ray Tracing. Applied Sciences 12:19 (9599). DOI: 10.3390/app12199599. Online publication date: 24-Sep-2022
  • (2022) RTNN. Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (76-89). DOI: 10.1145/3503221.3508409. Online publication date: 2-Apr-2022
  • (2022) Mach-RT: A Many Chip Architecture for High Performance Ray Tracing. IEEE Transactions on Visualization and Computer Graphics 28:3 (1585-1596). DOI: 10.1109/TVCG.2020.3021048. Online publication date: 1-Mar-2022
  • (2022) RTA: an Efficient SIMD Architecture for Ray Tracing. 2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys) (43-50). DOI: 10.1109/HPCC-DSS-SmartCity-DependSys57074.2022.00040. Online publication date: Dec-2022
  • (2021) Intersection Prediction for Accelerated GPU Ray Tracing. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (709-723). DOI: 10.1145/3466752.3480097. Online publication date: 18-Oct-2021
  • (2021) A Survey on Bounding Volume Hierarchies for Ray Tracing. Computer Graphics Forum 40:2 (683-712). DOI: 10.1111/cgf.142662. Online publication date: 4-Jun-2021
