DOI: 10.1145/2736277.2741092
research-article

The Web as a Jungle: Non-Linear Dynamical Systems for Co-evolving Online Activities

Published: 18 May 2015

Abstract

Given a large collection of co-evolving online activities, such as searches for the keywords "Xbox", "PlayStation" and "Wii", how can we find patterns and rules? Are these keywords related? If so, are they competing against each other? Can we forecast the volume of user activity for the coming month? We conjecture that online activities compete for user attention in the same way that species in an ecosystem compete for food. We present ECOWEB (Ecosystem on the Web), an intuitive model designed as a non-linear dynamical system for mining large-scale co-evolving online activities. Our second contribution is a novel, parameter-free, and scalable fitting algorithm, ECOWEB-FIT, that estimates the parameters of ECOWEB. Extensive experiments on real data show that ECOWEB is effective, in that it captures long-range dynamics and meaningful patterns such as seasonalities, and practical, in that it provides accurate long-range forecasts. ECOWEB consistently outperforms existing methods in terms of both accuracy and execution speed.
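
The abstract does not state the model equations, so the following is only an illustrative sketch of the ecological analogy, not the ECOWEB model or the ECOWEB-FIT algorithm. It assumes the textbook competitive Lotka-Volterra system, dP_i/dt = r_i P_i (1 - sum_j a_ij P_j / K_i), integrated with a plain Euler step. The function name, parameter names, and all numeric values are hypothetical and chosen for illustration.

```python
def simulate_competition(p0, growth, capacity, interaction, steps, dt=0.1):
    """Euler integration of a competitive Lotka-Volterra system.

    p0          -- initial activity volume per keyword (hypothetical data)
    growth      -- intrinsic growth rate r_i per keyword
    capacity    -- carrying capacity K_i per keyword
    interaction -- matrix a_ij: how strongly keyword j suppresses keyword i
    Returns the list of states, one per time step (including the start).
    """
    n = len(p0)
    p = list(p0)
    history = [list(p)]
    for _ in range(steps):
        nxt = []
        for i in range(n):
            # Total competitive pressure on keyword i from all keywords.
            pressure = sum(interaction[i][j] * p[j] for j in range(n))
            dp = growth[i] * p[i] * (1.0 - pressure / capacity[i])
            # Activity volumes cannot go negative.
            nxt.append(max(p[i] + dt * dp, 0.0))
        p = nxt
        history.append(list(p))
    return history


if __name__ == "__main__":
    # Three made-up keywords competing for the same pool of attention.
    trajectory = simulate_competition(
        p0=[5.0, 10.0, 2.0],
        growth=[0.8, 0.6, 1.0],
        capacity=[100.0, 100.0, 100.0],
        interaction=[[1.0, 0.4, 0.3],
                     [0.5, 1.0, 0.2],
                     [0.6, 0.7, 1.0]],
        steps=500,
    )
    print([round(v, 2) for v in trajectory[-1]])
```

In this sketch, an off-diagonal entry a_ij greater than zero means keyword j draws attention away from keyword i; setting every off-diagonal entry to zero reduces each keyword to independent logistic growth.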

    Information & Contributors

    Information

    Published In

    WWW '15: Proceedings of the 24th International Conference on World Wide Web
    May 2015
    1460 pages
    ISBN:9781450334693

    Sponsors

    • IW3C2: International World Wide Web Conference Committee

    Publisher

    International World Wide Web Conferences Steering Committee

    Republic and Canton of Geneva, Switzerland

    Author Tags

    1. ecosystem
    2. non-linear
    3. parameter-free
    4. time-series

    Qualifiers

    • Research-article

    Funding Sources

    • JSPS
    • NSF
    • ARL

    Conference

    WWW '15
    Sponsor:
    • IW3C2

    Acceptance Rates

    WWW '15 Paper Acceptance Rate 131 of 929 submissions, 14%;
    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Article Metrics

    • Downloads (Last 12 months): 24
    • Downloads (Last 6 weeks): 3
    Reflects downloads up to 21 Nov 2024

    Cited By

    • (2023) Dynamic Causal Modelling and Predictive Analysis for the COVID-19 Pandemic. 2023 IEEE International Conference on Intelligence and Security Informatics (ISI), pages 1-6. DOI: 10.1109/ISI58743.2023.10297254. Online publication date: 2-Oct-2023
    • (2023) Modeling Hierarchical Seasonality Through Low-Rank Tensor Decompositions in Time Series Analysis. IEEE Access, 11, pages 85770-85784. DOI: 10.1109/ACCESS.2023.3298597. Online publication date: 2023
    • (2023) DeepVATS: Deep Visual Analytics for Time Series. Knowledge-Based Systems, 277, article 110793. DOI: 10.1016/j.knosys.2023.110793. Online publication date: Oct-2023
    • (2023) Vessel Trajectory Segmentation: A Survey. Database Systems for Advanced Applications. DASFAA 2023 International Workshops, pages 166-180. DOI: 10.1007/978-3-031-35415-1_12. Online publication date: 17-Apr-2023
    • (2022) Simple epidemic models with segmentation can be better than complex ones. PLOS ONE, 17(1), article e0262244. DOI: 10.1371/journal.pone.0262244. Online publication date: 12-Jan-2022
    • (2022) Mining Reaction and Diffusion Dynamics in Social Activities. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, pages 1521-1531. DOI: 10.1145/3511808.3557396. Online publication date: 17-Oct-2022
    • (2021) DIFCURV: A unified framework for Diffusion Curve Fitting and prediction in Online Social Networks. Array, 12, article 100100. DOI: 10.1016/j.array.2021.100100. Online publication date: Dec-2021
    • (2019) Automatic Sequential Pattern Mining in Data Streams. Proceedings of the 28th ACM International Conference on Information and Knowledge Management, pages 1733-1742. DOI: 10.1145/3357384.3358002. Online publication date: 3-Nov-2019
    • (2019) Concept of Keystone Species in Web Systems. Proceedings of the 10th ACM Conference on Web Science, pages 81-85. DOI: 10.1145/3292522.3326023. Online publication date: 26-Jun-2019
    • (2019) Dynamic Modeling and Forecasting of Time-evolving Data Streams. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 458-468. DOI: 10.1145/3292500.3330947. Online publication date: 25-Jul-2019
