DOI: 10.1145/3632410.3632479
demonstration

STRCUBE - An architecture and Reference Implementation for Streaming Star Schema

Published: 04 January 2024

Abstract

Star schemas have long been used in data warehouses to build multidimensional models of data for analytics, and OLAP operations are performed on the dimensional model for exploratory analysis. Streaming data poses special challenges when managed with traditional database models such as the vanilla relational data model: the notions of continuous queries and unbounded storage go against the principles of traditional database architecture. In this paper, we propose STRCUBE, an architecture for implementing a star schema in which the fact table is not a conventionally stored table but a data stream. The architecture supports OLAP operations such as slice, dice, roll-up, and drill-down on streaming fact tables. The unique contribution of this work is that the architecture is not tied to any specific streaming or big data platform. The paper also provides a reference implementation, with results from the domain of stock market data.
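To make the idea of a streaming fact table concrete, the following is a minimal sketch in plain Python and not the STRCUBE reference implementation. It assumes a hypothetical `Trade` fact record, a small stored stock dimension (`STOCK_DIM`) with a `sector` attribute, and facts that arrive in event-time order; all of these names and values are illustrative and do not come from the paper. The fact rows are consumed lazily from an iterator, a slice filters on one dimension attribute, and a roll-up aggregates from symbol to sector over tumbling time windows.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Trade:
    """One streaming fact row (hypothetical schema)."""
    symbol: str   # foreign key into the stock dimension
    minute: int   # event time, in minutes
    volume: int   # additive measure


# Stored dimension table (illustrative contents): symbol -> attributes.
STOCK_DIM: Dict[str, Dict[str, str]] = {
    "AAA": {"sector": "Tech"},
    "BBB": {"sector": "Tech"},
    "CCC": {"sector": "Energy"},
}


def slice_facts(facts: Iterable[Trade], sector: str) -> Iterator[Trade]:
    """Slice: keep only facts whose stock belongs to the given sector."""
    for fact in facts:
        if STOCK_DIM[fact.symbol]["sector"] == sector:
            yield fact


def rollup(
    facts: Iterable[Trade], window: int
) -> Iterator[Tuple[int, Dict[str, int]]]:
    """Roll up symbol -> sector and minute -> tumbling window of `window`
    minutes, summing volume. A window's totals are emitted when the first
    fact of a later window arrives (assumes in-order arrival), and the
    last open window is flushed when the stream ends."""
    current = None
    totals: Dict[str, int] = defaultdict(int)
    for fact in facts:
        w = fact.minute // window
        if current is not None and w != current:
            yield current, dict(totals)
            totals.clear()
        current = w
        totals[STOCK_DIM[fact.symbol]["sector"]] += fact.volume
    if current is not None:
        yield current, dict(totals)
```

Because both functions are generators, they compose without materializing the fact table: `rollup(slice_facts(stream, "Tech"), window=5)` holds only the totals of the currently open window in memory. Out-of-order or late-arriving facts would need watermarks or a similar mechanism, which this sketch deliberately omits.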

Published In

CODS-COMAD '24: Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD)
January 2024
627 pages
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. OLAP
  2. analytics
  3. big data
  4. data warehouse
  5. star schema
  6. stream cube
  7. stream management

Qualifiers

  • Demonstration
  • Research
  • Refereed limited

Conference

CODS-COMAD 2024
