Integration and dimensional modeling approaches for complex data warehousing

  • Original Paper
Journal of Global Optimization

Abstract

With the broad development of the World Wide Web, various kinds of heterogeneous data (including multimedia data) are now available for decision support tasks. A data warehousing approach is often adopted to prepare such data for analysis, because data integration and dimensional modeling allow the creation of appropriate analysis contexts. However, existing data warehousing tools are well suited only to classical, numerical data and cannot handle complex data. In our approach, we adapt the three main phases of the data warehousing process to complex data. In this paper, we focus on two of these phases. The first is data integration. We define a generic UML model that helps represent a wide range of complex data, including their possible semantic properties. Complex data are then stored in XML documents generated by a piece of software we designed. The second phase we address is the preparation of data for dimensional modeling. We propose an approach that exploits data mining techniques to assist users in building relevant dimensional models.
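As a rough illustration of the integration step summarized above, the following minimal sketch serializes one complex data object, made of heterogeneous sub-documents, into an XML document. It is not the authors' UML model or their generation software: the function `complex_object_to_xml`, the element names (`complex_object`, `subdocument`, `property`), and the sample values are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def complex_object_to_xml(name, subdocuments):
    """Serialize one complex data object and its sub-documents to an XML string.

    All element and attribute names are hypothetical placeholders,
    not the schema used in the paper.
    """
    root = ET.Element("complex_object", {"name": name})
    for sub in subdocuments:
        # Each sub-document (text, image, ...) becomes one child element.
        node = ET.SubElement(root, "subdocument", {"type": sub["type"]})
        for key, value in sub["properties"].items():
            # Low-level or semantic properties are stored as named children.
            prop = ET.SubElement(node, "property", {"name": key})
            prop.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample object combining a text and an image sub-document.
xml_text = complex_object_to_xml(
    "sample_object",
    [
        {"type": "text", "properties": {"language": "en", "length": 1200}},
        {"type": "image", "properties": {"format": "jpeg", "width": 640}},
    ],
)
print(xml_text)
```

Storing every property as a generic named element keeps the sketch independent of any particular data type, which is one plausible way to accommodate a wide range of complex data in a single document structure.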

Author information

Corresponding author

Correspondence to O. Boussaid.

About this article

Cite this article

Boussaid, O., Tanasescu, A., Bentayeb, F. et al. Integration and dimensional modeling approaches for complex data warehousing. J Glob Optim 37, 571–591 (2007). https://doi.org/10.1007/s10898-006-9064-6

  • DOI: https://doi.org/10.1007/s10898-006-9064-6
