Abstract
Designing a multidimensional model is a non-trivial task: the requirements collected from senior managers can be inaccurate and difficult to use, especially when the managers are not database practitioners. On the other hand, the pressing need for data warehousing arises because managers submit analytical summary queries against operational relational databases. In this paper, we view these analytical queries as a more reliable basis for multidimensional schema design. We first study how to use such queries to automatically generate measures, dimensions and dimension hierarchies, together with their representation in a star schema. We then use the results to study the schema evolution problem: if a new query Q′ cannot be answered by the previously built schema, how can a new schema be developed from the old one so that Q′ can be answered in the new schema? Finally, we consider rewriting OLAP queries on a conventional database (multidimensional data warehouse) using materialized views in the data warehouse. Our method analyzes the attributes of the given relational queries and the dependency relationships among them to design a schema in which each dimension is a lattice. The schema can answer the queries from which it was generated, and the number of dimensions is minimal. Furthermore, the schema can answer many more queries that are similar to the given ones.
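To make the idea of deriving a schema from aggregation queries concrete, the following is a minimal illustrative sketch, not the algorithm of the paper. It assumes each query is already reduced to its grouping attributes and aggregated attributes, and that the dependencies among attributes are given as pairs (a, b) meaning a determines b. The function name `derive_star_schema`, the dictionary layout and the sample attributes (`city`, `country`, `month`, `year`, `sales`, `price`) are all hypothetical. Aggregated attributes become measures, and grouping attributes linked by a dependency are placed in the same dimension, with the dependencies serving as roll-up edges.

```python
from collections import defaultdict


def derive_star_schema(queries, fds):
    """Toy derivation of measures and dimensions from aggregation queries.

    queries: list of dicts with 'group_by' (attribute names) and
             'aggregates' (list of (function, attribute) pairs).
    fds:     set of (a, b) pairs, read as "a determines b" (assumed given).
    """
    # Every aggregated attribute is treated as a measure.
    measures = sorted({attr for q in queries for _, attr in q["aggregates"]})
    # Every grouping attribute is a candidate dimension level.
    attrs = sorted({a for q in queries for a in q["group_by"]})

    # Union-find: attributes connected by a dependency share a dimension.
    parent = {a: a for a in attrs}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a, b in fds:
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for a in attrs:
        groups[find(a)].append(a)

    dimensions = []
    for members in groups.values():
        # Roll-up edges are the dependencies that stay inside the group.
        rollups = sorted((a, b) for a, b in fds if a in members and b in members)
        dimensions.append({"levels": sorted(members), "rollups": rollups})
    dimensions.sort(key=lambda d: d["levels"])
    return {"measures": measures, "dimensions": dimensions}


# Hypothetical input: two summary queries over a sales table.
queries = [
    {"group_by": ["city", "month"], "aggregates": [("SUM", "sales")]},
    {"group_by": ["country", "year"],
     "aggregates": [("SUM", "sales"), ("AVG", "price")]},
]
fds = {("city", "country"), ("month", "year")}

schema = derive_star_schema(queries, fds)
print(schema["measures"])
for dim in schema["dimensions"]:
    print(dim["levels"], dim["rollups"])
```

In this sketch the two queries yield one location-like dimension and one time-like dimension. It does not attempt the minimality guarantee or the lattice construction described in the abstract; it only shows how grouping attributes and dependencies can drive the split into dimensions.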
Notes
If t1[A] is null, then t1[A]≠t2[A] whether or not t2[A] is null.
We assume the MDs have had binding transformations applied.
We can also use Property 1 to obtain \(\mathcal {G}_{i_{h}}\) here. Refer to Algorithm 1.
Acknowledgements
This work was partially supported by the Education Technology Foundation of the Ministry of Education of China (No. 2017A01020) and the Scientific Research Project of the Hebei Education Department of China (No. ZD2018205).
Cite this article
Huo, Z., Taylor, K., Zhang, X. et al. Generating multidimensional schemata from relational aggregation queries. World Wide Web 23, 337–359 (2020). https://doi.org/10.1007/s11280-019-00706-9
DOI: https://doi.org/10.1007/s11280-019-00706-9