DOI: 10.1145/2851613.2851662
research-article

OLAP analysis of multidimensional tweet streams for supporting advanced analytics

Published: 04 April 2016

Abstract

In this paper we propose integrating Time-Aware Fuzzy Formal Concept Analysis theory with OLAP technology over multidimensional tweet streams, in order to arrange the tweets in the resulting OLAP cube within a suitable hierarchical structure of concepts (i.e., a fuzzy lattice) according to their unstructured content. A microblog summarization algorithm is also introduced in order to provide a subset of the tweets that best represents the data of the OLAP cube according to the analysis perspective. The final goal is to support advanced analytics over social media, which is becoming increasingly relevant. A detailed real-life case study and an extensive experimental analysis complete our contributions.

    Published In

    SAC '16: Proceedings of the 31st Annual ACM Symposium on Applied Computing
    April 2016
    2360 pages
    ISBN:9781450337397
    DOI:10.1145/2851613

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. OLAP analysis of multidimensional tweets
    2. OLAP analytics
    3. OLAP over social media
    4. big data analytics

    Qualifiers

    • Research-article

    Conference

    SAC 2016
    SAC 2016: Symposium on Applied Computing
    April 4 - 8, 2016
    Pisa, Italy

    Acceptance Rates

    SAC '16 Paper Acceptance Rate 252 of 1,047 submissions, 24%;
    Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
