HTAP databases: What is new and what is next
Proceedings of the 2022 International Conference on Management of Data, 2022
Processing mixed workloads of transactions and analytical queries in a single database system can eliminate the ETL process and enable real-time analysis of transactional data. However, there is no free lunch. Because OLTP and OLAP workloads are interleaved, such systems must balance a trade-off between workload isolation and data freshness. Since Gartner coined the term Hybrid Transactional/Analytical Processing (HTAP), we have witnessed the emergence of various database systems that support HTAP. One common feature is that they combine the strengths of row stores and column stores to serve both kinds of workload well. Because they use disparate storage strategies and processing techniques to satisfy the requirements of various HTAP applications, it is essential to understand, compare, and evaluate their key techniques. In this tutorial, we offer a comprehensive survey of HTAP databases. We introduce a taxonomy of state-of-the-art HTAP databases according to their storage strategies and architectures. We then take a deep dive into their key techniques for transaction processing, analytical processing, data synchronization, query optimization, and resource scheduling. We also introduce existing HTAP benchmarks. Finally, we discuss the research challenges and open problems for HTAP.
ACM Digital Library