Abstract
Today’s data warehousing requires continuous or on-demand data integration through a Change-Data-Capture (CDC) process that extracts data deltas from Online Transaction Processing (OLTP) systems. This paper proposes a workload-aware CDC framework for on-demand data warehousing. The framework adopts three CDC strategies, namely trigger-based, timestamp-based and log-based, and captures data deltas while taking the workloads of the source systems into account. The framework is evaluated comprehensively, and the results demonstrate its effectiveness in terms of quality of service, including throughput, latency and staleness.
Acknowledgement
This research was supported by the HEAT 4.0 project (8090-00046b) funded by Innovationsfonden.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Qu, W., Liu, X., Dessloch, S. (2021). A Workload-Aware Change Data Capture Framework for Data Warehousing. In: Golfarelli, M., Wrembel, R., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2021. Lecture Notes in Computer Science, vol 12925. Springer, Cham. https://doi.org/10.1007/978-3-030-86534-4_21
DOI: https://doi.org/10.1007/978-3-030-86534-4_21
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86533-7
Online ISBN: 978-3-030-86534-4
eBook Packages: Computer Science, Computer Science (R0)