Abstract
With the broad development of the World Wide Web, various kinds of heterogeneous data (including multimedia data) are now available for decision support tasks. A data warehousing approach is often adopted to prepare data for relevant analysis, since data integration and dimensional modeling allow the creation of appropriate analysis contexts. However, existing data warehousing tools are well suited to classical, numerical data but cannot handle complex data. In our approach, we adapt the three main phases of the data warehousing process to complex data. In this paper, we focus on two main steps in complex data warehousing. The first step is data integration. We define a generic UML model that helps represent a wide range of complex data, including their possible semantic properties. Complex data are then stored in XML documents generated by a piece of software we designed. The second step we address is the preparation of data for dimensional modeling. We propose an approach that exploits data mining techniques to assist users in building relevant dimensional models.
Cite this article
Boussaid, O., Tanasescu, A., Bentayeb, F. et al. Integration and dimensional modeling approaches for complex data warehousing. J Glob Optim 37, 571–591 (2007). https://doi.org/10.1007/s10898-006-9064-6
DOI: https://doi.org/10.1007/s10898-006-9064-6