Abstract
Over the last few years, considerable research effort has gone into improving the design of ETL (Extract-Transform-Load) systems. Building such systems is considered time-consuming, error-prone, and complex, and it involves several participants from different knowledge domains. ETL processes are among the most important components of a data warehousing system, and they are strongly influenced by the complexity of business requirements and by how those requirements change and evolve. These aspects affect not only the structure of the data warehouse but also the structures of the data sources involved. To minimize the negative impact of such variables, we propose the use of ETL patterns to build specific ETL packages. In this paper, we formalize this approach using BPMN (Business Process Model and Notation) to model ETL workflows at a more conceptual level, and we map them to real execution primitives through a domain-specific language that allows the generation of specific instances that can be executed in a commercial ETL tool.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Oliveira, B., Belo, O. (2015). A Domain-Specific Language for ETL Patterns Specification in Data Warehousing Systems. In: Pereira, F., Machado, P., Costa, E., Cardoso, A. (eds) Progress in Artificial Intelligence. EPIA 2015. Lecture Notes in Computer Science, vol 9273. Springer, Cham. https://doi.org/10.1007/978-3-319-23485-4_60
DOI: https://doi.org/10.1007/978-3-319-23485-4_60
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23484-7
Online ISBN: 978-3-319-23485-4
eBook Packages: Computer Science, Computer Science (R0)