DOI: 10.1145/3662010.3663449
Research article, open access

So Far and yet so Near - Accelerating Distributed Joins with CXL

Published: 09 June 2024 Publication History

Abstract

Distributed partitioned joins are among the most expensive operators in distributed DBMSs, where a major part of the execution time is attributed to network transfer costs. Although high-speed network technologies, such as RDMA, can lower this cost, they still come with significantly higher latency than local DRAM access. The emerging CXL interconnect protocol promises direct, cache-coherent, byte-addressable access to remote memory without CPU intervention. For short-distance communication in distributed DBMSs, CXL is therefore an interesting alternative where low latency is required. In this work, we explore how CXL can be leveraged for engine-internal communication and data exchange. We discuss communication strategies and apply them to distributed joins. We emulate various CXL characteristics based on optimistic and pessimistic assumptions about the real performance of upcoming CXL devices and evaluate their impact on the execution of distributed joins. Our results show that CXL has the potential to improve distributed join performance.
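
To make the cost structure mentioned in the abstract concrete, the following is a minimal sketch of a hash-partitioned distributed join simulated in a single process. It is not the authors' implementation: the function names, the two-node layout, and the `shipped` counter are assumptions made for illustration only. The counter tallies tuples that must leave their original node, which is the data exchange step that a network or CXL link would have to carry.

```python
from collections import defaultdict


def partition(rows, key_index, num_nodes):
    """Assign each row to a node by hashing its join key."""
    parts = defaultdict(list)
    for row in rows:
        parts[hash(row[key_index]) % num_nodes].append(row)
    return parts


def distributed_partitioned_join(r_by_node, s_by_node, num_nodes):
    """Join relations R and S on their first column (hypothetical helper).

    r_by_node / s_by_node map a node id to the rows initially stored there.
    Returns (join results, number of tuples that had to leave their node).
    """
    r_parts, s_parts = defaultdict(list), defaultdict(list)
    shipped = 0
    for source, target_parts in ((r_by_node, r_parts), (s_by_node, s_parts)):
        for node, rows in source.items():
            for dest, chunk in partition(rows, 0, num_nodes).items():
                if dest != node:
                    # This chunk crosses the interconnect (network, RDMA, or CXL).
                    shipped += len(chunk)
                target_parts[dest].extend(chunk)

    results = []
    for node in range(num_nodes):
        # Local build phase on R, then probe phase with S, per node.
        table = defaultdict(list)
        for row in r_parts[node]:
            table[row[0]].append(row)
        for row in s_parts[node]:
            for match in table[row[0]]:
                results.append(match + row[1:])
    return results, shipped


if __name__ == "__main__":
    r = {0: [(1, "a"), (2, "b")], 1: [(3, "c"), (4, "d")]}
    s = {0: [(2, "x"), (3, "y")], 1: [(1, "z"), (4, "w")]}
    rows, shipped = distributed_partitioned_join(r, s, num_nodes=2)
    print(sorted(rows), shipped)
```

In this toy layout half of the input tuples change nodes before any local join work starts, which illustrates why the latency and bandwidth of the interconnect weigh so heavily on this operator.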

Published In

DaMoN '24: Proceedings of the 20th International Workshop on Data Management on New Hardware
June 2024
123 pages
ISBN: 9798400706677
DOI: 10.1145/3662010
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Actian Corp.

Conference

SIGMOD/PODS '24
Acceptance Rates

DaMoN '24 Paper Acceptance Rate 14 of 25 submissions, 56%;
Overall Acceptance Rate 94 of 127 submissions, 74%
