Design and evaluation of storage organizations for read-optimized main memory databases

Published: 01 August 2013

Abstract

Existing main memory data processing systems employ a variety of storage organizations and make a number of storage-related design choices. The focus of this paper is on systematically evaluating a number of these key storage design choices for main memory analytical (i.e. read-optimized) database settings. Our evaluation produces a number of key insights: First, it is always beneficial to organize data into self-contained memory blocks rather than large files. Second, both column-stores and row-stores display performance advantages for different types of queries, and for high performance both should be implemented as options for the tuple-storage layout. Third, cache-sensitive B+-tree indices can play a major role in accelerating query performance, especially when used in a block-oriented organization. Finally, compression can also play a role in accelerating query performance depending on data distribution and query selectivity.
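As a rough illustration of the row-store versus column-store choice for the tuple-storage layout inside a memory block, the sketch below stores the same tuples both ways. The class names `RowBlock` and `ColumnBlock`, the fixed-width integer attributes, and the sample data are all hypothetical and are not taken from the paper or its implementation; the sketch only shows how the two layouts arrange values in memory, not how they perform.

```python
from array import array


class RowBlock:
    """Hypothetical row-store block: all attributes of a tuple sit together."""

    def __init__(self, num_columns):
        self.num_columns = num_columns
        # Flat buffer laid out as t0.c0, t0.c1, ..., t1.c0, t1.c1, ...
        self.values = array("q")

    def append(self, row):
        assert len(row) == self.num_columns
        self.values.extend(row)

    def scan_column(self, col):
        # Strided access: every read skips the tuple's other attributes.
        return list(self.values[col::self.num_columns])

    def get_row(self, i):
        # One contiguous slice yields the whole tuple.
        start = i * self.num_columns
        return tuple(self.values[start:start + self.num_columns])


class ColumnBlock:
    """Hypothetical column-store block: each attribute has its own array."""

    def __init__(self, num_columns):
        self.columns = [array("q") for _ in range(num_columns)]

    def append(self, row):
        assert len(row) == len(self.columns)
        for column, value in zip(self.columns, row):
            column.append(value)

    def scan_column(self, col):
        # Contiguous access: only the requested attribute is touched.
        return list(self.columns[col])

    def get_row(self, i):
        # Reassembling a tuple reads one value from every column array.
        return tuple(column[i] for column in self.columns)


rows = [(1, 10, 100), (2, 20, 200), (3, 30, 300)]
row_block, col_block = RowBlock(3), ColumnBlock(3)
for r in rows:
    row_block.append(r)
    col_block.append(r)
```

Both blocks answer the same requests with the same results; what differs is the access pattern. Scanning a single attribute is contiguous in `ColumnBlock` and strided in `RowBlock`, while fetching a full tuple is contiguous in `RowBlock` and scattered in `ColumnBlock`. This matches the abstract's point that each layout suits different types of queries.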

Published In

Proceedings of the VLDB Endowment, Volume 6, Issue 13
August 2013
180 pages

Publisher

VLDB Endowment

