Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article
Open access
Just Accepted

SPIRIT: Scalable and Persistent In-Memory Indices for Real-Time Search

Online AM: 04 November 2024 Publication History

Abstract

Today, real-time search over big microblogging data requires low indexing and query latency. Online services, therefore, prefer to host inverted indices in memory. Unfortunately, as datasets grow, indices grow proportionally, and with limited DRAM scaling, the main memory faces high pressure. Also, indices must be persisted on disks as building them is computationally intensive. Consequently, it becomes necessary to frequently move on-heap index segments to storage, slowing down indexing. Reading storage-resident index segments requires filesystem calls and disk accesses during query evaluation, leading to high and unpredictable tail latency.
This work exploits hybrid DRAM and scalable non-volatile memory (NVM) to offer dynamically growing and instantly searchable (i.e., real-time) persistent indices in on-heap memory. We implement SPIRIT, a real-time text inversion engine over hybrid memory. SPIRIT exploits the byte-addressability of hybrid memory to enable direct access to the index on a pre-allocated heap, eliminating expensive block storage accesses and filesystem calls during live operation. It uses an in-memory segment descriptor table to offer: ① instant segment availability to query evaluators upon fresh ingestion, ② low-overhead segment movement across memory tiers transparent to query evaluators, and ③ decoupled segment movement into NVM from their visibility to query evaluators, enabling different policies for mitigating NVM latency. SPIRIT accelerates compaction with zero-copy merging. It supports volatile, graceful shutdown, and crash-consistent indexing modes. The latter two modes offer instant recovery using persistent pointers.
SPIRIT with hybrid memory and strong crash consistency guarantees exhibits many orders of magnitude better tail response times and query throughout than the state-of-the-art Lucene search engine. Compared against a highly optimized non-real-time evaluation of Lucene with liberal DRAM size, on average, across six query workloads, SPIRIT still delivers 2.5 × better (real-time) query throughput. Our work applies to other services that will benefit from direct on-heap access to large persistent indices.

References

[1]
S. Aggarwal, H. Almasi, M. DeHerrera, B. Hughes, S. Ikegawa, J. Janesky, H. K. Lee, H. Lu, F. B. Mancoff, K. Nagel, G. Shimon, J. J. Sun, T. Andre, and S. M. Alam. 2019. Demonstration of a Reliable 1 Gb Standalone Spin-Transfer Torque MRAM For Industrial Applications. In IEEE International Electron Devices Meeting (IEDM). https://doi.org/10.1109/IEDM19573.2019.8993516
[2]
Shoaib Akram. 2021. Exploiting Intel Optane Persistent Memory for Full Text Search. In ACM SIGPLAN International Symposium on Memory Management (ISMM). https://doi.org/10.1145/3459898.3463906
[3]
Shoaib Akram. 2021. Performance Evaluation of Intel Optane Memory for Managed Workloads. ACM Trans. Archit. Code Optim. 18, 3, Article 29(Apr 2021). https://doi.org/10.1145/3451342
[4]
Hao Chen, Chaoyi Ruan, Cheng Li, Xiaosong Ma, and Yinlong Xu. 2021. SpanDB: A Fast, Cost-Effective LSM-tree Based KV Store on Hybrid Storage. In 19th USENIX Conference on File and Storage Technologies (FAST). https://www.usenix.org/conference/fast21/presentation/chen-hao
[5]
Youmin Chen, Youyou Lu, Fan Yang, Qing Wang, Yang Wang, and Jiwu Shu. 2020. FlatStore: An Efficient Log-Structured Key-Value Storage Engine for Persistent Memory. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). https://doi.org/10.1145/3373376.3378515
[6]
Aditya Chilukuri and Shoaib Akram. 2023. Analyzing and Improving the Scalability of In-Memory Indices for Managed Search Engines. In ACM SIGPLAN International Symposium on Memory Management (ISMM). https://doi.org/10.1145/3591195.3595272
[7]
Jeremy Condit, Edmund B. Nightingale, Christopher Frost, Engin Ipek, Benjamin Lee, Doug Burger, and Derrick Coetzee. 2009. Better I/O Through Byte-addressable, Persistent Memory. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles (SOSP). https://doi.org/10.1145/1629575.1629589
[8]
Intel Corporation. 2021. pmemkv. https://github.com/pmem/pmemkv
[9]
Elastic. 2021. Elastic Enterprise Search. https://www.elastic.co/elasticsearch/
[10]
The Apache Software Foundation. 2021. Apache Solr 8.8.2. https://solr.apache.org/
[11]
The Apache Software Foundation. 2021. Welcome to Apache Lucene. https://lucene.apache.org/
[12]
Bill Gervasi. 2019. Will Carbon Nanotube Memory Replace DRAM?IEEE Micro 39, 2 (2019), 45–51. https://doi.org/10.1109/MM.2019.2897560
[13]
Caixin Gong, Chengjin Tian, Zhengheng Wang, Sheng Wang, Xiyu Wang, Qiulei Fu, Wu Qin, Long Qian, Rui Chen, Jiang Qi, Ruo Wang, Guoyun Zhu, Chenghu Yang, Wei Zhang, and Feifei Li. 2022. Tair-PMem: A Fully Durable Non-Volatile Memory Database. Proc. VLDB Endow. 15, 12 (aug 2022), 3346–3358. https://doi.org/10.14778/3554821.3554827
[14]
Siddharth Gupta, Yunho Oh, Lei Yan, Mark Sutherland, Abhishek Bhattacharjee, Babak Falsafi, and Peter Hsu. 2023. AstriFlash A Flash-Based System for Online Services. In IEEE International Symposium on Inverted Files Computer Architecture (HPCA). https://doi.org/10.1109/HPCA56546.2023.10070955
[15]
Daokun Hu, Zhiwen Chen, Jianbing Wu, Jianhua Sun, and Hao Chen. 2021. Persistent memory hash indexes: an experimental evaluation. Proc. VLDB Endow. 14, 5 (Jan. 2021), 785–798. https://doi.org/10.14778/3446095.3446101
[16]
Yihe Huang, Matej Pavlovic, Virendra Marathe, Margo Seltzer, Tim Harris, and Steve Byan. 2018. Closing the Performance Gap Between Volatile and Persistent Key-Value Stores Using Cross-Referencing Logs. In USENIX Annual Technical Conference (ATC). https://www.usenix.org/conference/atc18/presentation/huang
[17]
Yichen Jia and Feng Chen. 2020. From Flash to 3D XPoint: Performance Bottlenecks and Potentials in RocksDB with Storage Evolution. In IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). https://doi.org/10.1109/ISPASS48437.2020.00034
[18]
Olzhas Kaiyrakhmet, Songyi Lee, Beomseok Nam, Sam H. Noh, and Young ri Choi. 2019. SLM-DB: Single-Level Key-Value Store with Persistent Memory. In USENIX Conference on File and Storage Technologies (FAST). https://www.usenix.org/conference/fast19/presentation/kaiyrakhmet
[19]
Sudarsun Kannan, Nitish Bhat, Ada Gavrilovska, Andrea Arpaci-Dusseau, and Remzi Arpaci-Dusseau. 2018. Redesigning LSMs for Nonvolatile Memory with NoveLSM. In USENIX Annual Technical Conference (ATC). https://www.usenix.org/conference/atc18/presentation/kannan
[20]
Hiwot Tadese Kassa, Jason Akers, Mrinmoy Ghosh, Zhichao Cao, Vaibhav Gogte, and Ronald Dreslinski. 2021. Improving Performance of Flash Based Key-Value Stores Using Storage Class Memory as a Volatile Memory Extension. In USENIX Annual Technical Conference (ATC). https://www.usenix.org/conference/atc21/presentation/kassa
[21]
Kinam Kim. 2008. Future memory technology: challenges and opportunities. In 2008 International Symposium on VLSI Technology, Systems and Applications (VLSI-TSA). 5–9. https://doi.org/10.1109/VTSA.2008.4530774
[22]
Martin Kleppmann. 2017. Designing Data-Intensive Applications. O’Reilly, Beijing. https://www.safaribooksonline.com/library/view/designing-data-intensive-applications/9781491903063/
[23]
Youngjin Kwon, Henrique Fingler, Tyler Hunt, Simon Peter, Emmett Witchel, and Thomas Anderson. 2017. Strata: A Cross Media File System. In Proceedings of the Symposium on Operating Systems Principles (SOSP). https://doi.org/10.1145/3132747.3132770
[24]
Stefan Lai. 2008. Non-volatile memory technologies: The quest for ever lower cost. In 2008 IEEE International Electron Devices Meeting. 1–6. https://doi.org/10.1109/IEDM.2008.4796601
[25]
Sekwon Lee, Soujanya Ponnapalli, Sharad Singhal, Marcos K. Aguilera, Kimberly Keeton, and Vijay Chidambaram. 2022. DINOMO: An Elastic, Scalable, High-Performance Key-Value Store for Disaggregated Persistent Memory. Proc. VLDB Endow. 15, 13 (2022), 4023 – 4037. https://doi.org/10.14778/3565838.3565854
[26]
Baptiste Lepers, Oana Balmau, Karan Gupta, and Willy Zwaenepoel. 2019. KVell: the design and implementation of a fast persistent key-value store. In Proceedings of the 27th ACM Symposium on Operating Systems Principles (Huntsville, Ontario, Canada) (SOSP ’19). 447–461. https://doi.org/10.1145/3341301.3359628
[27]
Kevin Lim, Jichuan Chang, Trevor Mudge, Parthasarathy Ranganathan, Steven K. Reinhardt, and Thomas F. Wenisch. 2009. Disaggregated Memory for Expansion and Sharing in Blade Servers. In International Symposium on Computer Architecture (ISCA). https://doi.org/10.1145/1555754.1555789
[28]
Kevin Lim, Yoshio Turner, Jose Renato Santos, Alvin AuYoung, Jichuan Chang, Parthasarathy Ranganathan, and Thomas F. Wenisch. 2012. System-level implications of disaggregated memory. In IEEE International Symposium on Inverted Files Comp Architecture (HPCA). https://doi.org/10.1109/HPCA.2012.6168955
[29]
Baotong Lu, Xiangpeng Hao, Tianzheng Wang, and Eric Lo. 2020. Dash: Scalable Hashing on Persistent Memory. Proc. VLDB Endow. 13, 8 (apr 2020), 1147–1161. https://doi.org/10.14778/3389133.3389134
[30]
Baotong Lu, Xiangpeng Hao, Tianzheng Wang, and Eric Lo. 2021. Scaling Dynamic Hash Tables on Real Persistent Memory. SIGMOD Rec. 50, 1 (jun 2021), 87–94. https://doi.org/10.1145/3471485.3471506
[31]
Michael McCandless. 201. Visualizing Lucene’s segment merges. https://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
[32]
Michael McCandless. 2021. Luceneutil: Lucene benchmarking utilities. http://blog.mikemccandless.com
[33]
Moohyeon Nam, Hokeun Cha, Young ri Choi, Sam H. Noh, and Beomseok Nam. 2019. Write-Optimized Dynamic Hashing for Persistent Memory. In USENIX Conference on File and Storage Technologies (FAST). https://www.usenix.org/conference/fast19/presentation/nam
[34]
Ismail Oukid, Johan Lasperas, Anisoara Nica, Thomas Willhalm, and Wolfgang Lehner. 2016. FPTree: A Hybrid SCM-DRAM Persistent and Concurrent B-Tree for Storage Class Memory. In Proceedings of the 2016 International Conference on Management of Data (San Francisco, California, USA) (SIGMOD ’16). 371–386. https://doi.org/10.1145/2882903.2915251
[35]
Patrick O’Neil, Edward Cheng, Dieter Gawlick, and Elizabeth O’Neil. 1996. The Log-Structured Merge-Tree (LSM-Tree). Acta Inf. 33, 4 (jun 1996), 351–385. https://doi.org/10.1007/s002360050048
[36]
Twitter. 2023. Tweet Search System (Earlybird). https://github.com/twitter/the-algorithm/blob/main/src/java/com/twitter/search/README.md
[37]
Twitter. 2023. Twitter’s Recommendation Algorithm. https://blog.twitter.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm
[38]
Lukas Vogel, Alexander van Renen, Satoshi Imamura, Jana Giceva, Thomas Neumann, and Alfons Kemper. 2022. Plush: A Write-Optimized Persistent Log-Structured Hash-Table. Proc. VLDB Endow. 15, 11 (2022).
[39]
Jing Wang, Youyou Lu, Qing Wang, Minhui Xie, Keji Huang, and Jiwu Shu. 2022. Pacman: An Efficient Compaction Approach for Log-Structured Key-Value Store on Persistent Memory. In USENIX Annual Technical Conference (ATC). https://www.usenix.org/conference/atc22/presentation/wang-jing
[40]
Johannes Weiner, Niket Agarwal, Dan Schatzberg, Leon Yang, Hao Wang, Blaise Sanouillet, Bikash Sharma, Tejun Heo, Mayank Jain, Chunqiang Tang, and Dimitrios Skarlatos. 2022. TMO: Transparent Memory Offloading in Datacenters. In Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’22). 609–621. https://doi.org/10.1145/3503222.3507731
[41]
Matthew Wilcox. 2014. Add Support for NV-DIMMs to Ext4. https://lwn.net/Articles/613384/
[42]
Zhongle Xie, Qingchao Cai, Gang Chen, Rui Mao, and Meihui Zhang. 2018. A Comprehensive Performance Evaluation of Modern In-Memory Indices. In International Conference on Data Engineering (ICDE). https://doi.org/10.1109/ICDE.2018.00064
[43]
Jian Xu, Lu Zhang, Amirsaman Memaripour, Akshatha Gangadharaiah, Amit Borase, Tamires Brito Da Silva, Steven Swanson, and Andy Rudoff. 2017. NOVA-Fortis: A Fault-Tolerant Non-Volatile Main Memory File System. In Proceedings of the 26th Symposium on Operating Systems Principles (SOSP). https://doi.org/10.1145/3132747.3132761
[44]
Jian Yang, Juno Kim, Morteza Hoseinzadeh, Joseph Izraelevitz, and Steven Swanson. 2020. An Empirical Guide to the Behavior and Use of Scalable Persistent Memory. In Proceedings of the 14th Usenix Conference on File and Storage Technologies (FAST). https://www.usenix.org/conference/fast20/presentation/yang
[45]
Jun Yang, Qingsong Wei, Cheng Chen, Chundong Wang, Khai Leong Yong, and Bingsheng He. 2015. NV-Tree: Reducing Consistency Cost for NVM-based Single Level Systems. In USENIX Conference on File and Storage Technologies (FAST). https://www.usenix.org/conference/fast15/technical-sessions/presentation/yang
[46]
Ting Yao, Yiwen Zhang, Jiguang Wan, Qiu Cui, Liu Tang, Hong Jiang, Changsheng Xie, and Xubin He. 2020. MatrixKV: Reducing Write Stalls and Write Amplification in LSM-tree Based KV Stores with Matrix Container in NVM. In USENIX Annual Technical Conference (ATC). https://www.usenix.org/conference/atc20/presentation/yao
[47]
Wenhui Zhang, Xingsheng Zhao, Song Jiang, and Hong Jiang. 2021. ChameleonDB: a key-value store for optane persistent memory. In European Conference on Computer Systems (EuroSys). https://doi.org/10.1145/3447786.3456237
[48]
Shengan Zheng, Morteza Hoseinzadeh, and Steven Swanson. 2019. Ziggurat: A Tiered File System for Non-Volatile Main Memories and Disks. In 17th USENIX Conference on File and Storage Technologies (FAST 19). USENIX Association, Boston, MA, 207–219. https://www.usenix.org/conference/fast19/presentation/zheng
[49]
Justin Zobel and Alistair Moffat. 2006. Inverted Files for Text Search Engines. ACM Comput. Surv. 38, 2 (jul 2006), 56 pages. https://doi.org/10.1145/1132956.1132959
[50]
Pengfei Zuo, Yu Hua, and Jie Wu. 2018. Write-Optimized and High-Performance Hashing Index Scheme for Persistent Memory. In USENIX Symposium on Operating Systems Design and Implementation (OSDI). https://www.usenix.org/conference/osdi18/presentation/zuo

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Architecture and Code Optimization
ACM Transactions on Architecture and Code Optimization Just Accepted
EISSN:1544-3973
Table of Contents
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Online AM: 04 November 2024
Accepted: 17 October 2024
Revised: 02 October 2024
Received: 18 February 2024

Check for updates

Author Tags

  1. Real-time search
  2. hybrid memory
  3. non-volatile memory
  4. data-intensive

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 76
    Total Downloads
  • Downloads (Last 12 months)76
  • Downloads (Last 6 weeks)76
Reflects downloads up to 18 Nov 2024

Other Metrics

Citations

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media