DOI: 10.1007/978-3-642-34002-4_3
Article

Of cubes, DAGs and hierarchical correlations: a novel conceptual model for analyzing social media data

Published: 15 October 2012

Abstract

With the advent of social media there is an ever-increasing amount of unstructured data that can be analyzed to obtain insights. Two prominent examples are sentiment analysis and the discovery of correlated concepts. A convenient representation of information in such scenarios is in terms of concepts extracted from the unstructured data, and measures, such as sentiment scores, associated with these concepts. Typically, social media analysis reports these concepts and their associated measures. We argue that much richer insights can be obtained through the use of OLAP-style multidimensional analysis. It is fairly straightforward to see how to add traditional dimension hierarchies such as time and geography, and to analyze the data along these dimensions using traditional OLAP operations such as roll-up; for instance, to answer queries of the form "What was the average sentiment for X in Europe during the past month?" However, it is trickier to answer queries of the form "What was the average sentiment for concepts related to X in Europe during the past month?" We introduce a conceptual modeling framework that extends traditional multidimensional models and OLAP operators to address the new set of requirements for data extracted from social media. In this model, we organize data along both traditional dimensions (we call these metadata dimensions) and concept dimensions, which model relationships among concepts using parent-child hierarchies. Specifically: (i) we allow operations on parent-child hierarchies to be treated in a uniform way as operations on traditional dimension hierarchies; (ii) to model the rich relationships that can exist among concepts, we extend the parent-child hierarchies to be rooted level-DAGs rather than simply trees; and (iii) we introduce new equivalence classes that allow us to reason with "similar" concepts in new ways. We show that our modeling and operator framework facilitates multidimensional analysis that yields richer insights from social media data than is possible with existing methods.
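
The second query in the abstract ("concepts related to X in Europe during the past month") can be illustrated with a minimal sketch. It is not the paper's operator framework: it assumes that "related to X" means X together with its descendants in the concept hierarchy, and the concept names, fact rows, and the helper names `descendants` and `avg_sentiment` are all hypothetical. The point it shows is that in a DAG a concept reachable through several parents must be counted only once during roll-up.

```python
# Hypothetical concept dimension as a rooted DAG: parent -> children.
# "battery" has two parents, so this is a DAG rather than a tree.
CHILDREN = {
    "all": ["phone", "laptop"],
    "phone": ["battery", "screen"],
    "laptop": ["battery", "keyboard"],
}

# Hypothetical fact rows: (concept, region, month, sentiment score).
# Region and month stand in for the metadata dimensions.
FACTS = [
    ("battery", "Europe", "2012-09", -0.4),
    ("screen", "Europe", "2012-09", 0.8),
    ("keyboard", "Europe", "2012-09", 0.2),
    ("phone", "Europe", "2012-09", 0.6),
    ("battery", "Asia", "2012-09", 0.5),
    ("screen", "Europe", "2012-08", -1.0),
]


def descendants(concept, children=CHILDREN):
    """Return the concept and every concept below it, each exactly once."""
    seen = set()
    stack = [concept]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already reached through another parent
        seen.add(node)
        stack.extend(children.get(node, []))
    return seen


def avg_sentiment(concept, region, month, facts=FACTS):
    """Average sentiment over a concept subtree, sliced by region and month."""
    members = descendants(concept)
    scores = [
        score
        for c, r, m, score in facts
        if c in members and r == region and m == month
    ]
    return sum(scores) / len(scores) if scores else None


if __name__ == "__main__":
    # Roll-up over "phone" and its descendants for one region and month.
    print(avg_sentiment("phone", "Europe", "2012-09"))
```

Collecting descendants into a set before aggregating is the design choice that matters here: summing child totals recursively, as one would on a tree, would count the facts for "battery" twice when rolling up to "all".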


Cited By

  • (2018) Advanced topic modeling for social business intelligence. Information Systems 53:C (87-106). DOI: 10.1016/j.is.2015.04.005. Online publication date: 30-Dec-2018
  • (2014) A methodology for social BI. Proceedings of the 18th International Database Engineering & Applications Symposium (207-216). DOI: 10.1145/2628194.2628250. Online publication date: 7-Jul-2014
  • (2013) Meta-stars. Proceedings of the sixteenth international workshop on Data warehousing and OLAP (11-18). DOI: 10.1145/2513190.2513195. Online publication date: 28-Oct-2013

    Published In

    ER'12: Proceedings of the 31st international conference on Conceptual Modeling
    October 2012
    591 pages
    ISBN: 9783642340017
    Editors: Paolo Atzeni, David Cheung, Sudha Ram

    Sponsors

    • Springer
    • Università degli Studi di Brescia
    • Università della Calabria, Rende (CS), Italy
    • Università degli Studi di Milano
    • HP

    Publisher

    Springer-Verlag

    Berlin, Heidelberg

    Qualifiers

    • Article
