Abstract
Data warehousing and OLAP are mainly used for the analysis of transactional data. With the evolution of the Internet and the development of semi-structured data exchange formats (such as XML), entire fragments of data, such as documents, can now be considered as analysis sources. As a consequence, an adapted multidimensional analysis framework needs to be provided. In this paper, we introduce an OLAP multidimensional conceptual model without facts. This model is based on the single concept of dimension and is adapted for multidimensional document analysis. We also provide a set of manipulation operations.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ravat, F., Teste, O., Tournier, R., Zurfluh, G. (2007). A Conceptual Model for Multidimensional Analysis of Documents. In: Parent, C., Schewe, K.-D., Storey, V.C., Thalheim, B. (eds) Conceptual Modeling - ER 2007. ER 2007. Lecture Notes in Computer Science, vol 4801. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75563-0_37
DOI: https://doi.org/10.1007/978-3-540-75563-0_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75562-3
Online ISBN: 978-3-540-75563-0
eBook Packages: Computer Science, Computer Science (R0)