DOI: 10.1145/3572848.3577475
research-article
Open access

High-Performance GPU-to-CPU Transpilation and Optimization via High-Level Parallel Constructs

Published: 21 February 2023

Abstract

While parallelism remains the main source of performance, architectural implementations and programming models change with each new hardware generation, often leading to costly application re-engineering. Most tools for performance portability require manual and costly application porting to yet another programming model.
We propose an alternative approach, based on Polygeist/MLIR, that automatically translates programs written in one programming model (CUDA) into another (CPU threads). Our approach includes a representation of parallel constructs that allows conventional compiler transformations to apply transparently and without modification, and that enables parallelism-specific optimizations. We evaluate our framework by transpiling and optimizing the CUDA Rodinia benchmark suite for a multi-core CPU and achieve a 58% geomean speedup over handwritten OpenMP code. Further, we show how CUDA kernels from PyTorch can efficiently run and scale on the CPU-only supercomputer Fugaku without user intervention. Our PyTorch compatibility layer, which uses transpiled CUDA PyTorch kernels, outperforms the PyTorch CPU native backend by 2.7×.
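
To make the idea of running a CUDA kernel on CPU threads concrete, here is a minimal hand-written sketch. It is not output of the framework described above, which operates on MLIR rather than C++ source. It only illustrates one well-known way to honor a block-level barrier on a CPU: split the per-thread loop at the barrier so that every logical thread finishes the first phase before any starts the second. The kernel, the names `reverse_in_block_cpu` and `kBlock`, and the block size are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CUDA kernel, shown for reference only (not compiled here):
//
//   __global__ void reverse_in_block(const float* in, float* out) {
//     __shared__ float tile[BLOCK];
//     int t = threadIdx.x, base = blockIdx.x * BLOCK;
//     tile[t] = in[base + t];                // phase 1: stage into shared memory
//     __syncthreads();                       // block-level barrier
//     out[base + t] = tile[BLOCK - 1 - t];   // phase 2: read a neighbour's value
//   }

constexpr std::size_t kBlock = 4;  // assumed block size, chosen for illustration

// CPU version: the grid becomes an outer loop over blocks and the threads of
// a block become inner loops. The barrier is honored by splitting the thread
// loop in two at the point where __syncthreads() appeared.
std::vector<float> reverse_in_block_cpu(const std::vector<float>& in) {
  assert(in.size() % kBlock == 0);  // the sketch handles whole blocks only
  std::vector<float> out(in.size());
  const std::size_t num_blocks = in.size() / kBlock;

  // Blocks are independent, so this loop could be distributed across CPU
  // threads (for example with an OpenMP parallel-for); it is kept serial here
  // so the sketch builds without extra flags.
  for (std::size_t b = 0; b < num_blocks; ++b) {
    float tile[kBlock];  // stands in for the block's shared memory
    const std::size_t base = b * kBlock;

    // Phase 1: the code before the barrier, executed for every thread.
    for (std::size_t t = 0; t < kBlock; ++t) {
      tile[t] = in[base + t];
    }
    // The end of the loop above plays the role of the barrier: every write
    // to tile has completed before any read below.

    // Phase 2: the code after the barrier, executed for every thread.
    for (std::size_t t = 0; t < kBlock; ++t) {
      out[base + t] = tile[kBlock - 1 - t];
    }
  }
  return out;
}
```

Calling `reverse_in_block_cpu` on eight values with the assumed block size of 4 reverses each group of four independently. Keeping both phases in a single loop would let a logical thread read a `tile` entry that has not been written yet, which is why the split is needed.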

      Information & Contributors

      Information

      Published In

      PPoPP '23: Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming
      February 2023
      480 pages
      ISBN: 9798400700156
      DOI: 10.1145/3572848
      This work is licensed under a Creative Commons Attribution International 4.0 License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. CUDA
      2. MLIR
      3. barrier synchronization
      4. polygeist

      Qualifiers

      • Research-article

      Conference

      PPoPP '23

      Acceptance Rates

      Overall Acceptance Rate 230 of 1,014 submissions, 23%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months): 1,309
      • Downloads (Last 6 weeks): 175
      Reflects downloads up to 19 Nov 2024

      Citations

      Cited By

      • (2024) CuPBoP: Making CUDA a Portable Language. ACM Transactions on Design Automation of Electronic Systems 29:4 (1-25). DOI: 10.1145/3659949. Online publication date: 21-Jun-2024
      • (2024) Fast Template-Based Code Generation for MLIR. Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction (1-12). DOI: 10.1145/3640537.3641567. Online publication date: 17-Feb-2024
      • (2024) Retargeting and Respecializing GPU Workloads for Performance Portability. Proceedings of the 2024 IEEE/ACM International Symposium on Code Generation and Optimization (119-132). DOI: 10.1109/CGO57630.2024.10444828. Online publication date: 2-Mar-2024
      • (2024) Strided Difference Bound Matrices. Computer Aided Verification (279-302). DOI: 10.1007/978-3-031-65627-9_14. Online publication date: 24-Jul-2024
      • (2023) Implementation Techniques for SPMD Kernels on CPUs. Proceedings of the 2023 International Workshop on OpenCL (1-12). DOI: 10.1145/3585341.3585342. Online publication date: 18-Apr-2023
      • (2023) CNT: Semi-Automatic Translation from CWL to Nextflow for Genomic Workflows. 2023 IEEE 23rd International Conference on Bioinformatics and Bioengineering (BIBE) (22-27). DOI: 10.1109/BIBE60311.2023.00012. Online publication date: 4-Dec-2023
      • (2023) MLIRSmith: Random Program Generation for Fuzzing MLIR Compiler Infrastructure. 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE) (1555-1566). DOI: 10.1109/ASE56229.2023.00120. Online publication date: 11-Sep-2023
