Abstract
Data warehouses are traditionally refreshed periodically, most often on a daily basis. As a result, there is a delay between a business transaction and its appearance in the data warehouse. The most recent data is trapped in the operational sources, where it is unavailable for analysis. To make timely decisions, today’s business users ask for ever fresher data.
Near real-time data warehousing addresses this challenge by shortening the data warehouse refreshment intervals and hence delivering source data to the data warehouse with lower latency. One consequence is that data warehouse refreshment can no longer be confined to off-peak hours. In particular, the source data may change concurrently with data warehouse refreshment. In this paper, we show that anomalies may arise under these circumstances, leaving the data warehouse in an inconsistent state, and we propose approaches to avoid such refreshment anomalies.
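As a rough illustration of the kind of anomaly meant here, the following sketch simulates an incremental refresh of a join view while both sources receive new rows. It is not taken from the paper: the tables (`orders`, `customers`), the `join` helper, and the naive refresh strategy are all hypothetical and chosen only to show how joining captured changes against the live source state can produce a result that differs from a full recomputation.

```python
def join(orders, customers):
    """Join orders with customers on customer id; returns (order_id, name) rows."""
    return sorted(
        (order_id, name)
        for order_id, cust_id in orders
        for cid, name in customers
        if cid == cust_id
    )


# Hypothetical source state at the time of the previous refresh.
orders = [(1, "c1")]
customers = [("c1", "Ada")]
warehouse = join(orders, customers)

# Changes committed at the sources since then (assumed to be captured as deltas).
delta_orders = [(2, "c2")]
delta_customers = [("c2", "Bob")]

# Live source state seen by the refresh job, which already contains both changes.
orders_now = orders + delta_orders
customers_now = customers + delta_customers

# Naive incremental refresh: each delta is joined against the live other source.
naive = (
    warehouse
    + join(delta_orders, customers_now)
    + join(orders_now, delta_customers)
)

# Reference result: recompute the view from the live sources.
expected = join(orders_now, customers_now)

print("naive refresh:", sorted(naive))
print("recomputed   :", expected)
```

In this toy setup the row for order 2 is derived once from each delta, so the naively refreshed warehouse contains it twice, while the recomputed view contains it once.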
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jörg, T., Dessloch, S. (2010). Near Real-Time Data Warehousing Using State-of-the-Art ETL Tools. In: Castellanos, M., Dayal, U., Miller, R.J. (eds) Enabling Real-Time Business Intelligence. BIRTE 2009. Lecture Notes in Business Information Processing, vol 41. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14559-9_7
DOI: https://doi.org/10.1007/978-3-642-14559-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14558-2
Online ISBN: 978-3-642-14559-9
eBook Packages: Computer Science, Computer Science (R0)