Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Everything you always wanted to know about compiled and vectorized queries but were afraid to ask

Published: 01 September 2018 Publication History

Abstract

The query engines of most modern database systems are either based on vectorization or data-centric code generation. These two state-of-the-art query processing paradigms are fundamentally different in terms of system structure and query execution code. Both paradigms were used to build fast systems. However, until today it is not clear which paradigm yields faster query execution, as many implementation-specific choices obstruct a direct comparison of architectures. In this paper, we experimentally compare the two models by implementing both within the same test system. This allows us to use for both models the same query processing algorithms, the same data structures, and the same parallelization framework to ultimately create an apples-to-apples comparison. We find that both are efficient, but have different strengths and weaknesses. Vectorization is better at hiding cache miss latency, whereas data-centric compilation requires fewer CPU instructions, which benefits cache-resident workloads. Besides raw, single-threaded performance, we also investigate SIMD as well as multi-core parallelization and different hardware architectures. Finally, we analyze qualitative differences as a guide for system architects.

References

[1]
https://stackoverflow.com/questions/36932240/avx2-what-is-the-most-efficient-way-to-pack-left-based-on-a-mask, 2016.
[2]
S. Agarwal, D. Liu, and R. Xin. Apache Spark as a compiler: Joining a billion rows per second on a laptop. "https://databricks.com/blog/2016/05/23/apache-spark-as-a-compiler-joining-a-billion-rows-per-second-on-a-laptop.html", 2016.
[3]
K. Anikiej. Multi-core parallelization of vectorized query execution. Master's thesis, University of Warsaw and VU University Amsterdam, 2010. http://homepages.cwi.nl/~boncz/msc/2010-KamilAnikijej.pdf.
[4]
R. Appuswamy, A. Anadiotis, D. Porobic, M. Iman, and A. Ailamaki. Analyzing the impact of system architecture on the scalability of OLTP engines for high-contention workloads. PVLDB, 11(2):121--134, 2017.
[5]
C. Balkesen, J. Teubner, G. Alonso, and M. T. Özsu. Main-memory hash joins on multi-core CPUs: Tuning to the underlying hardware. In ICDE, pages 362--373, 2013.
[6]
P. Boncz, T. Neumann, and O. Erling. TPC-H analyzed: Hidden messages and lessons learned from an influential benchmark. In TPCTC, 2013.
[7]
P. Boncz, M. Zukowski, and N. Nes. MonetDB/X100: Hyper-pipelining query execution. In CIDR, 2005.
[8]
P. A. Boncz, O. Erling, and M. Pham. Advances in large-scale RDF data management. In Linked Open Data - Creating Knowledge Out of Interlinked Data - Results of the LOD2 Project, pages 21--44. Springer, 2014.
[9]
P. A. Boncz, M. L. Kersten, and S. Manegold. Breaking the memory wall in MonetDB. Commun. ACM, 51(12):77--85, 2008.
[10]
P. A. Boncz, W. Quak, and M. L. Kersten. Monet and its geographic extensions: A novel approach to high performance GIS processing. In EDBT, pages 147--166, 1996.
[11]
A. Crotty, A. Galakatos, K. Dursun, T. Kraska, C. Binnig, U. Çetintemel, and S. Zdonik. An architecture for compiling UDF-centric workflows. PVLDB, 8(12):1466--1477, 2015.
[12]
C. Freedman, E. Ismert, and P. Larson. Compilation in the Microsoft SQL Server Hekaton engine. IEEE Data Eng. Bull., 37(1):22--30, 2014.
[13]
G. Graefe. Encapsulation of parallelism in the Volcano query processing system. In SIGMOD, pages 102--111, 1990.
[14]
G. Graefe and W. J. McKenna. The Volcano optimizer generator: Extensibility and efficient search. In ICDE, pages 209--218, 1993.
[15]
T. Gubner and P. Boncz. Exploring query compilation strategies for JIT, vectorization and SIMD. In IMDM, 2017.
[16]
T. Kersten. https://github.com/TimoKersten/db-engine-paradigms, 2018.
[17]
A. Kohn, V. Leis, and T. Neumann. Adaptive execution of compiled queries. In ICDE, 2018.
[18]
K. Krikellas, S. Viglas, and M. Cintra. Generating code for holistic query evaluation. In ICDE, pages 613--624, 2010.
[19]
H. Lang, T. Muhlbauer, F. Funke, P. Boncz, T. Neumann, and A. Kemper. Data Blocks: Hybrid OLTP and OLAP on compressed storage using both vectorization and compilation. In SIGMOD, pages 311--326, 2016.
[20]
P. Larson, C. Clinciu, C. Fraser, E. N. Hanson, M. Mokhtar, M. Nowakiewicz, V. Papadimos, S. L. Price, S. Rangarajan, R. Rusanu, and M. Saubhasik. Enhancements to SQL Server column stores. In SIGMOD, pages 1159--1168, 2013.
[21]
P. Larson, C. Clinciu, E. N. Hanson, A. Oks, S. L. Price, S. Rangarajan, A. Surna, and Q. Zhou. SQL Server column store indexes. In SIGMOD, pages 1177--1184, 2011.
[22]
V. Leis, P. Boncz, A. Kemper, and T. Neumann. Morsel-driven parallelism: A NUMA-aware query evaluation framework for the many-core age. In SIGMOD, pages 743--754, 2014.
[23]
V. Leis, K. Kundhikanjana, A. Kemper, and T. Neumann. Efficient processing of window functions in analytical SQL queries. PVLDB, 8(10):1058--1069, 2015.
[24]
V. Leis, B. Radke, A. Gubichev, A. Mirchev, P. Boncz, A. Kemper, and T. Neumann. Query optimization through the looking glass, and what we found running the join order benchmark. VLDB Journal, 2018.
[25]
R. A. Lorie. XRM - an extended (n-ary) relational memory. IBM Research Report, G320-2096, 1974.
[26]
P. Menon, A. Pavlo, and T. Mowry. Relaxed operator fusion for in-memory databases: Making compilation, vectorization, and prefetching work together at last. PVLDB, 11(1):1--13, 2017.
[27]
T. Neumann. Efficient generation and execution of DAG-structured query graphs. PhD thesis, University of Mannheim, 2005.
[28]
T. Neumann. Efficiently compiling efficient query plans for modern hardware. PVLDB, 4(9):539--550, 2011.
[29]
T. Neumann and V. Leis. Compiling database queries into machine code. IEEE Data Eng. Bull., 37(1):3--11, 2014.
[30]
S. Padmanabhan, T. Malkemus, R. C. Agarwal, and A. Jhingran. Block oriented processing of relational database operations in modern computer architectures. In ICDE, pages 567--574, 2001.
[31]
S. Palkar, J. J. Thomas, D. Narayanan, A. Shanbhag, R. Palamuttam, H. Pirk, M. Schwarzkopf, S. P. Amarasinghe, S. Madden, and M. Zaharia. Weld: Rethinking the interface between data-intensive applications. CoRR, abs/1709.06416, 2017.
[32]
S. Palkar, J. J. Thomas, A. Shanbhag, M. Schwarzkopt, S. P. Amarasinghe, and M. Zaharia. A common runtime for high performance data analysis. In CIDR, 2017.
[33]
J. M. Patel, H. Deshmukh, J. Zhu, H. Memisoglu, N. Potti, S. Saurabh, M. Spehlmann, and Z. Zhang. Quickstep: A data platform based on the scaling-in approach. PVLDB, 11(6):663--676, 2018.
[34]
H. Pirk, O. Moll, M. Zaharia, and S. Madden. Voodoo - A vector algebra for portable database performance on modern hardware. PVLDB, 9(14):1707--1718, 2016.
[35]
O. Polychroniou, A. Raghavan, and K. A. Ross. Rethinking SIMD vectorization for in-memory databases. In SIGMOD, pages 1493--1508, 2005.
[36]
O. Polychroniou and K. A. Ross. High throughput heavy hitter aggregation for modern SIMD processors. In DaMoN, 2013.
[37]
O. Polychroniou and K. A. Ross. Vectorized bloom filters for advanced SIMD processors. In DaMoN, 2014.
[38]
O. Polychroniou and K. A. Ross. Efficient lightweight compression alongside fast scans. In DaMoN, 2015.
[39]
B. Raducanu, P. Boncz, and M. Zukowski. Micro adaptivity in Vectorwise. In SIGMOD, pages 1231--1242, 2013.
[40]
V. Raman, G. K. Attaluri, R. Barber, N. Chainani, D. Kalmuk, V. KulandaiSamy, J. Leenstra, S. Lightstone, S. Liu, G. M. Lohman, T. Malkemus, R. Müller, I. Pandis, B. Schiefer, D. Sharpe, R. Sidle, A. J. Storm, and L. Zhang. DB2 with BLU acceleration: So much more than just a column store. PVLDB, 6(11):1080--1091, 2013.
[41]
S. Schuh, X. Chen, and J. Dittrich. An experimental comparison of thirteen relational equi-joins in main memory. In SIGMOD, pages 1961--1976, 2016.
[42]
A. Shaikhha, Y. Klonatos, L. Parreaux, L. Brown, M. Dashti, and C. Koch. How to architect a query compiler. In SIGMOD, pages 1907--1922, 2016.
[43]
U. Sirin, P. Tözün, D. Porobic, and A. Ailamaki. Micro-architectural analysis of in-memory OLTP. In SIGMOD, pages 387--402, 2016.
[44]
E. A. Sitaridi, O. Polychroniou, and K. A. Ross. SIMD-accelerated regular expression matching. In DaMoN, 2016.
[45]
J. Sompolski, M. Zukowski, and P. A. Boncz. Vectorization vs. compilation in query execution. In DaMoN, pages 33--40, 2011.
[46]
D. Song and S. Chen. Exploiting SIMD for complex numerical predicates. In ICDE, pages 143--149, 2016.
[47]
R. Y. Tahboub, G. M. Essertel, and T. Rompf. How to architect a query compiler, revisited. In SIGMOD, pages 307--322, 2018.
[48]
A. Vogelsgesang, M. Haubenschild, J. Finis, A. Kemper, V. Leis, T. Mühlbauer, T. Neumann, and M. Then. Get real: How benchmarks fail to represent the real world. In DBTest, 2018.
[49]
S. Wanderman-Milne and N. Li. Runtime code generation in Cloudera Impala. IEEE Data Eng. Bull., 37(1):31--37, 2014.
[50]
T. Willhalm, N. Popovici, Y. Boshmaf, H. Plattner, A. Zeier, and J. Schaffner. SIMD-Scan: Ultra fast in-memory table scan using on-chip vector processing units. PVLDB, 2(1):385--394, 2009.
[51]
J. Zhou and K. A. Ross. Implementing database operations using SIMD instructions. In SIGMOD, pages 145--156, 2002.
[52]
M. Zukowski. Balancing Vectorized Query Execution with Bandwidth-Optimized Storage. PhD thesis, University of Amsterdam, 2009.
[53]
M. Zukowski, S. Héman, N. Nes, and P. A. Boncz. Super-scalar RAM-CPU cache compression. In ICDE, 2006.

Cited By

View all
  • (2024)Composable Data Management: An Execution OverviewProceedings of the VLDB Endowment10.14778/3685800.368584717:12(4249-4252)Online publication date: 1-Aug-2024
  • (2024)Simple (yet Efficient) Function Authoring for Vectorized EnginesProceedings of the VLDB Endowment10.14778/3685800.368583617:12(4187-4199)Online publication date: 1-Aug-2024
  • (2024)ClickHouse - Lightning Fast Analytics for EveryoneProceedings of the VLDB Endowment10.14778/3685800.368580217:12(3731-3744)Online publication date: 1-Aug-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the VLDB Endowment
Proceedings of the VLDB Endowment  Volume 11, Issue 13
September 2018
218 pages
ISSN:2150-8097
Issue’s Table of Contents

Publisher

VLDB Endowment

Publication History

Published: 01 September 2018
Published in PVLDB Volume 11, Issue 13

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)208
  • Downloads (Last 6 weeks)16
Reflects downloads up to 24 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Composable Data Management: An Execution OverviewProceedings of the VLDB Endowment10.14778/3685800.368584717:12(4249-4252)Online publication date: 1-Aug-2024
  • (2024)Simple (yet Efficient) Function Authoring for Vectorized EnginesProceedings of the VLDB Endowment10.14778/3685800.368583617:12(4187-4199)Online publication date: 1-Aug-2024
  • (2024)ClickHouse - Lightning Fast Analytics for EveryoneProceedings of the VLDB Endowment10.14778/3685800.368580217:12(3731-3744)Online publication date: 1-Aug-2024
  • (2024)Hardware-Efficient Data Imputation through DBMS ExtensibilityProceedings of the VLDB Endowment10.14778/3681954.368201617:11(3497-3510)Online publication date: 1-Jul-2024
  • (2024)WeBridge: Synthesizing Stored Procedures for Large-Scale Real-World Web ApplicationsProceedings of the ACM on Management of Data10.1145/36393192:1(1-29)Online publication date: 26-Mar-2024
  • (2024)Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query EngineCompanion of the 2024 International Conference on Management of Data10.1145/3626246.3653368(5-17)Online publication date: 9-Jun-2024
  • (2024)QuartetExpert Systems with Applications: An International Journal10.1016/j.eswa.2023.122841244:COnline publication date: 15-Jun-2024
  • (2023)Big Data Analytic Toolkit: A General-Purpose, Modular, and Heterogeneous Acceleration Toolkit for Data Analytical EnginesProceedings of the VLDB Endowment10.14778/3611540.361155816:12(3702-3714)Online publication date: 1-Aug-2023
  • (2023)ALP: Adaptive Lossless floating-Point CompressionProceedings of the ACM on Management of Data10.1145/36267171:4(1-26)Online publication date: 12-Dec-2023
  • (2023)Indexed Streams: A Formal Intermediate Representation for Fused Contraction ProgramsProceedings of the ACM on Programming Languages10.1145/35912687:PLDI(1169-1193)Online publication date: 6-Jun-2023
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media