DOI: 10.1145/3674399.3674418
research-article
Open access

Rethinking Hash Tables: Challenges and Opportunities with Compute Express Link (CXL)

Published: 30 July 2024

Abstract

Hash tables can efficiently determine whether an element exists in a given set and are widely used in computer networks, the Internet of Things (IoT), data centers, and stream data mining. As data volumes keep growing, the memory consumption of hash tables keeps increasing. The emerging Compute Express Link (CXL) technology can significantly expand memory capacity, so porting hash tables from DRAM to CXL memory can relieve the pressure they place on DRAM space. However, porting hash tables to CXL memory is not a trivial task. This paper analyzes the challenges of porting hash tables to CXL memory and identifies opportunities to address them.

Published In

ACM-TURC '24: Proceedings of the ACM Turing Award Celebration Conference - China 2024
July 2024
261 pages
ISBN: 9798400710117
DOI: 10.1145/3674399
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing University
  • Jiangsu High-level Innovation and Entrepreneurship (Shuangchuang) Program
  • National Natural Science Foundation of China

Conference

ACM-TURC '24
