Near Real-Time Data Warehousing Using State-of-the-Art ETL Tools

  • Conference paper
Enabling Real-Time Business Intelligence (BIRTE 2009)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 41)

Abstract

Data warehouses are traditionally refreshed periodically, most often on a daily basis. Thus, there is some delay between a business transaction and its appearance in the data warehouse. The most recent data is trapped in the operational sources, where it is unavailable for analysis. For timely decision making, today’s business users ask for ever fresher data.

Near real-time data warehousing addresses this challenge by shortening the data warehouse refreshment intervals and hence delivering source data to the data warehouse with lower latency. One consequence is that data warehouse refreshment can no longer be performed in off-peak hours only. In particular, the source data may be changed concurrently with data warehouse refreshment. In this paper, we show that anomalies may arise under these circumstances, leading to an inconsistent state of the data warehouse, and we propose approaches to avoid refreshment anomalies.
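To make the notion of a refreshment anomaly concrete, the following is a minimal sketch and not an example taken from the paper. It assumes two hypothetical source relations r(a, b) and s(b, c), a warehouse table that stores their join, and a naive incremental refresh that joins each captured delta with the current source state. All names and the refresh strategy are assumptions chosen for illustration.

```python
def join(r, s):
    """Join r(a, b) with s(b, c) on b, returning a list (bag semantics)."""
    return [(a, b, c) for (a, b) in r for (b2, c) in s if b == b2]


def naive_refresh():
    """Apply two captured deltas against the *current* source state."""
    # Source state at the last refresh; the warehouse holds the join result.
    r = [(1, 2)]
    s = []
    warehouse = join(r, s)  # empty at this point

    # Two source changes happen close together. Both are already applied
    # at the sources before the first delta is processed (assumption).
    delta_s = [(2, 3)]
    delta_r = [(4, 2)]
    s = s + delta_s
    r = r + delta_r

    # Processing delta_s already sees the concurrently inserted (4, 2) ...
    warehouse += join(r, delta_s)
    # ... and processing delta_r then contributes the same tuple again.
    warehouse += join(delta_r, s)
    return warehouse, join(r, s)


warehouse, expected = naive_refresh()
print("warehouse:", sorted(warehouse))
print("expected: ", sorted(expected))
print("consistent:", sorted(warehouse) == sorted(expected))
```

In this sketch the warehouse ends up with a duplicate row that a full recomputation over the final source state would not contain. Whether a given ETL job is exposed to this depends on how it captures changes and what state it reads during the load.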


Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jörg, T., Dessloch, S. (2010). Near Real-Time Data Warehousing Using State-of-the-Art ETL Tools. In: Castellanos, M., Dayal, U., Miller, R.J. (eds) Enabling Real-Time Business Intelligence. BIRTE 2009. Lecture Notes in Business Information Processing, vol 41. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14559-9_7

  • DOI: https://doi.org/10.1007/978-3-642-14559-9_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14558-2

  • Online ISBN: 978-3-642-14559-9

  • eBook Packages: Computer Science, Computer Science (R0)
