Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2882903.2915222acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article
Public Access

Reducing the Storage Overhead of Main-Memory OLTP Databases with Hybrid Indexes

Published: 26 June 2016 Publication History

Abstract

Using indexes for query execution is crucial for achieving high performance in modern on-line transaction processing databases. For a main-memory database, however, these indexes consume a large fraction of the total memory available and are thus a major source of storage overhead of in-memory databases. To reduce this overhead, we propose using a two-stage index: The first stage ingests all incoming entries and is kept small for fast read and write operations. The index periodically migrates entries from the first stage to the second, which uses a more compact, read-optimized data structure. Our first contribution is hybrid index, a dual-stage index architecture that achieves both space efficiency and high performance. Our second contribution is Dual-Stage Transformation (DST), a set of guidelines for converting any order-preserving index structure into a hybrid index. Our third contribution is applying DST to four popular order-preserving index structures and evaluating them in both standalone microbenchmarks and a full in-memory DBMS using several transaction processing workloads. Our results show that hybrid indexes provide comparable throughput to the original ones while reducing the memory overhead by up to 70%.

References

[1]
ALTIBASE Index Documentation. http://aid.altibase.com/display/migfromora/Index.
[2]
Cache-Optimized Concurrent Skip list. http://sourceforge.net/projects/skiplist/files/Templatized%20C%2B%2B%20Version/.
[3]
H-Store. http://hstore.cs.brown.edu.
[4]
Leveldb. http://www.leveldb.org.
[5]
LZ4. https://code.google.com/p/lz4/.
[6]
MemSQL Documentation. http://docs.memsql.com/latest/concepts/indexes/.
[7]
MySQL Memory Storage Engine. http://dev.mysql.com/doc/refman/5.7/en/memory-storage-engine.html.
[8]
Mysql v5.5 -- how compression works for innodb tables. http://dev.mysql.com/doc/refman/5.5/en/innodb-compression-internals.html.
[9]
Overview of TimesTen Index Types. https://docs.oracle.com/cd/E21901_01/timesten.1122/e21633/comp.htm#TTOPR380.
[10]
Performance Application Programming Interface. http://icl.cs.utk.edu/papi/index.html.
[11]
Reddit. http://www.reddit.com.
[12]
Redis Index. http://redis.io/topics/indexes.
[13]
SAP Hana Indexes. http://saphanawiki.com/2015/09/2160391-faq-sap-hana-indexes/.
[14]
Snappy. https://github.com/google/snappy.
[15]
SQLite Documentation. https://www.sqlite.org/docs.html.
[16]
VoltDB Blog. https://voltdb.com/blog/best-practices-index-optimization-voltdb.
[17]
WiredTiger. http://wiredtiger.com.
[18]
J. L. Bentley and J. B. Saxe. Decomposable searching problems I: static-to-dynamic transformation. J. Algorithms, 1(4):301--358, 1980.
[19]
T. Bingmann. STX B+ tree C+ template classes. http://panthema.net/2007/stx-btree/.
[20]
C.-Y. Chan and Y. E. Ioannidis. Bitmap index design and evaluation. SIGMOD, pages 355--366, 1998.
[21]
D. Comer. Ubiquitous b-tree. ACM Comput. Surv., 11(2):121--137, June 1979.
[22]
B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears. Benchmarking cloud serving systems with YCSB. In SoCC, pages 143--154, 2010.
[23]
J. DeBrabant, A. Pavlo, S. Tu, M. Stonebraker, and S. Zdonik. Anti-caching: A new approach to database management system architecture. VLDB, 6(14):1942--1953, Sept. 2013.
[24]
D. J. DeWitt, R. H. Katz, F. Olken, L. D. Shapiro, M. R. Stonebraker, and D. Wood. Implementation techniques for main memory database systems. SIGMOD Rec., 14(2):1--8, 1984.
[25]
C. Diaconu, C. Freedman, E. Ismert, P.-A. Larson, P. Mittal, R. Stonecipher, N. Verma, and M. Zwilling. Hekaton: SQL Server's Memory-Optimized OLTP Engine. In SIGMOD, pages 1--12, 2013.
[26]
J. Dittrich and A. Jindal. Towards a one size fits all database architecture. In CIDR, pages 195--198, 2011.
[27]
A. Eldawy, J. J. Levandoski, and P. Larson. Trekking through siberia: Managing cold data in a memory-optimized database. PVLDB, 7(11):931--942, 2014.
[28]
F. Farber, S. K. Cha, J. Primsch, C. Bornhövd, S. Sigg, and W. Lehner. Sap hana database: data management for modern business applications. SIGMOD Rec., 40(4):45--51, Jan. 2012.
[29]
F. Funke, A. Kemper, and T. Neumann. Compacting transactional data in hybrid oltp&olap databases. VLDB, 5(11):1424--1435, July 2012.
[30]
S. Harizopoulos, D. J. Abadi, S. Madden, and M. Stonebraker. OLTP through the looking glass, and what we found there. In SIGMOD, pages 981--992, 2008.
[31]
S. Idreos, M. L. Kersten, and S. Manegold. Database cracking. In CIDR, pages 68--78, 2007.
[32]
G. J. Jacobson. Succinct static data structures. 1980.
[33]
R. Kallman, H. Kimura, J. Natkins, A. Pavlo, A. Rasin, S. Zdonik, E. P. C. Jones, S. Madden, M. Stonebraker, Y. Zhang, J. Hugg, and D. J. Abadi. H-Store: A High-Performance, Distributed Main Memory Transaction Processing System. VLDB, 1(2):1496--1499, 2008.
[34]
J. Krueger, C. Kim, M. Grund, N. Satish, D. Schwalb, J. Chhugani, H. Plattner, P. Dubey, and A. Zeier. Fast updates on read-optimized databases using multi-core cpus. VLDB, 5(1):61--72, Sept. 2011.
[35]
P.-A. Larson, C. Clinciu, E. N. Hanson, A. Oks, S. L. Price, S. Rangarajan, A. Surna, and Q. Zhou. Sql server column store indexes. SIGMOD, pages 1177--1184, 2011.
[36]
C. Lefurgy, K. Rajamani, F. Rawson, W. Felter, M. Kistler, and T. W. Keller. Energy management for commercial servers. Computer, 36(12).
[37]
V. Leis, A. Kemper, and T. Neumann. The adaptive radix tree: Artful indexing for main-memory databases. ICDE, pages 38--49, 2013.
[38]
J. Levandoski, D. Lomet, S. Sengupta, A. Birka, and C. Diaconu. Indexing on modern hardware: Hekaton and beyond. SIGMOD, pages 717--720, 2014.
[39]
H. Lim, B. Fan, D. G. Andersen, and M. Kaminsky. SILT: A memory-efficient, high-performance key-value store. SOSP '11, pages 1--13, 2011.
[40]
M. Mammarella, S. Hovsepian, and E. Kohler. Modular data storage with anvil. SOSP, pages 147--160, 2009.
[41]
Y. Mao, E. Kohler, and R. T. Morris. Cache craftiness for fast multicore key-value storage. EuroSys, pages 183--196, 2012.
[42]
J. I. Munro, T. Papadakis, and R. Sedgewick. Deterministic skip lists. the third annual ACM-SIAM symposium on Discrete algorithms, 1992.
[43]
P. Muth, P. O'Neil, A. Pick, and G. Weikum. The lham log-structured history data access method. The VLDB Journal, 8(3--4):199--221, Feb. 2000.
[44]
P. O'Neil, E. Cheng, D. Gawlick, and E. O'Neil. The log-structured merge-tree (lsm-tree). Acta Inf., 33(4):351--385, June 1996.
[45]
A. Prout. The story behind memsql's skiplist indexes. http://blog.memsql.com/the-story-behind-memsqls-skipłist-indexes/.
[46]
W. Pugh. Skip lists: A probabilistic alternative to balanced trees. Communications of the ACM, 33(6):668--676, 1990.
[47]
V. Sikka, F. Farber, W. Lehner, S. K. Cha, T. Peh, and C. Bornhövd. Efficient transaction processing in sap hana database: The end of a column store myth. SIGMOD, pages 731--742, 2012.
[48]
R. Stoica and A. Ailamaki. Enabling efficient os paging for main-memory oltp databases. DaMoN, 2013.
[49]
R. Stoica, J. J. Levandoski, and P.-A. Larson. Identifying hot and cold data in main-memory databases. ICDE, pages 26--37, 2013.
[50]
M. Stonebraker, S. Madden, D. J. Abadi, S. Harizopoulos, N. Hachem, and P. Helland. The end of an architectural era: (it's time for a complete rewrite). In VLDB, pages 1150--1160, 2007.
[51]
The Transaction Processing Council. TPC-C Benchmark (Revision 5.9.0). http://www.tpc.org/tpcc/, June 2007.
[52]
S. Tu, W. Zheng, E. Kohler, B. Liskov, and S. Madden. Speedy transactions in multicore in-memory databases. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, SOSP '13, pages 18--32, 2013.
[53]
T. Westmann, D. Kossmann, S. Helmer, and G. Moerkotte. The implementation and performance of compressed databases. SIGMOD Rec., 29(3):55--67, Sept. 2000.
[54]
A. C.-C. Yao. On random 2--3 trees. Acta Informatica, 9:159--170, 1978.
[55]
J. Zhou, P.-A. Larson, J. Goldstein, and L. Ding. Dynamic materialized views. ICDE, pages 526--535, 2007.

Cited By

View all
  • (2025)VEGA: An Active-tuning Learned Index with Group-Wise Learning GranularityProceedings of the ACM on Management of Data10.1145/37097363:1(1-26)Online publication date: 11-Feb-2025
  • (2025)Two-level massive string dictionariesInformation Systems10.1016/j.is.2024.102490128:COnline publication date: 1-Feb-2025
  • (2024)GLMI: An Efficient Spatiotemporal Index Leveraging Geohash and Piecewise Linear Models for Optimized Query PerformanceAlgorithms10.3390/a1711047417:11(474)Online publication date: 22-Oct-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data
June 2016
2300 pages
ISBN:9781450335317
DOI:10.1145/2882903
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 June 2016

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. hybrid index
  2. in-memory OLTP database

Qualifiers

  • Research-article

Funding Sources

  • NSF

Conference

SIGMOD/PODS'16
Sponsor:
SIGMOD/PODS'16: International Conference on Management of Data
June 26 - July 1, 2016
California, San Francisco, USA

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)274
  • Downloads (Last 6 weeks)42
Reflects downloads up to 13 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2025)VEGA: An Active-tuning Learned Index with Group-Wise Learning GranularityProceedings of the ACM on Management of Data10.1145/37097363:1(1-26)Online publication date: 11-Feb-2025
  • (2025)Two-level massive string dictionariesInformation Systems10.1016/j.is.2024.102490128:COnline publication date: 1-Feb-2025
  • (2024)GLMI: An Efficient Spatiotemporal Index Leveraging Geohash and Piecewise Linear Models for Optimized Query PerformanceAlgorithms10.3390/a1711047417:11(474)Online publication date: 22-Oct-2024
  • (2024)LeanStore: A High-Performance Storage Engine for NVMe SSDsProceedings of the VLDB Endowment10.14778/3685800.368591517:12(4536-4545)Online publication date: 8-Nov-2024
  • (2024)SepHash: A Write-Optimized Hash Index On Disaggregated Memory via Separate Segment StructureProceedings of the VLDB Endowment10.14778/3641204.364121817:5(1091-1104)Online publication date: 1-Jan-2024
  • (2024)Making In-Memory Learned Indexes Efficient on DiskProceedings of the ACM on Management of Data10.1145/36549542:3(1-26)Online publication date: 30-May-2024
  • (2024)Trinity: A Fast Compressed Multi-attribute Data StoreProceedings of the Nineteenth European Conference on Computer Systems10.1145/3627703.3650072(405-420)Online publication date: 22-Apr-2024
  • (2024)CPMA: An Efficient Batch-Parallel Compressed Set Without PointersProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638492(348-363)Online publication date: 2-Mar-2024
  • (2024)A Fast Learned Key-Value Store for Concurrent and Distributed SystemsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2023.3327009(1-14)Online publication date: 2024
  • (2024)Search-in-Memory: Reliable, Versatile, and Efficient Data Matching in SSD’s NAND Flash Memory Chip for Data Indexing AccelerationIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2024.344370243:11(3864-3875)Online publication date: Nov-2024
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media