Leadership discovery when data correlatively evolve

Di Wu¹,
Yiping Ke¹,
Jeffrey Xu Yu¹,
Philip S. Yu² &
Lei Chen³


Abstract

The World Wide Web is full of rich information, including text, XML, multimedia, and time series data. The web is usually represented as a large graph, and PageRank is computed to rank the importance of web pages. In this paper, we study the problem of ranking evolving time series and discovering leaders among them by analyzing lead-lag relations. A time series is considered a leader if its rise or fall impacts the behavior of many other time series. At each time point, we compute the lagged correlation between each pair of time series and model these correlations as a graph. A leadership rank is then computed from the graph, which brings order to the time series, and the leaders are extracted based on this ranking. The problem is challenging because the dynamic nature of time series results in a highly evolving graph of the relationships between them. We propose an efficient algorithm that tracks the lagged correlations and computes the leaders incrementally while still achieving good accuracy. Our experiments on real weather science data and stock data show that the algorithm computes time series leaders efficiently in real time, and that the detected leaders have high predictive power for events affecting the time series entities in general, which can inform both weather monitoring and financial risk control.
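The pipeline outlined in the abstract (pairwise lagged correlation, a graph over the series, a PageRank-style leadership rank) can be illustrated with a minimal sketch. This is not the authors' algorithm: it recomputes everything from scratch rather than incrementally, and the function names, the correlation threshold, the maximum lag, and the damping factor are all assumptions made for illustration.

```python
import numpy as np

def lagged_corr(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag], for lag >= 1.
    A high value suggests that x leads y by `lag` steps."""
    a, b = x[:-lag], y[lag:]
    if a.std() == 0 or b.std() == 0:
        return 0.0  # correlation is undefined for a constant series
    return float(np.corrcoef(a, b)[0, 1])

def leadership_graph(series, max_lag, threshold):
    """Build a weighted adjacency matrix W where W[j, i] > 0 means
    series j follows series i (edge from follower to leader).
    `max_lag` and `threshold` are hypothetical tuning parameters."""
    n = len(series)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            best = max(abs(lagged_corr(series[i], series[j], lag))
                       for lag in range(1, max_lag + 1))
            if best >= threshold:
                W[j, i] = best
    return W

def leadership_rank(W, damping=0.85, iters=100):
    """PageRank-style power iteration: followers pass rank to leaders."""
    n = W.shape[0]
    out = W.sum(axis=1, keepdims=True)
    # Row-normalize; a series that follows no one spreads its rank uniformly.
    P = np.where(out > 0, W / np.where(out > 0, out, 1.0), 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - damping) / n + damping * (P.T @ r)
    return r
```

Given a list of equal-length NumPy arrays, `leadership_rank(leadership_graph(series, max_lag=3, threshold=0.8))` returns one score per series, and the highest-scoring series are the candidate leaders. Rerunning this at every time point costs quadratic work in the number of series, which is the cost an incremental method such as the one proposed here aims to avoid.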




Author information

Authors and Affiliations

The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
Di Wu, Yiping Ke & Jeffrey Xu Yu
University of Illinois at Chicago, Chicago, IL, USA
Philip S. Yu
The Hong Kong University of Science and Technology, Kowloon, Hong Kong
Lei Chen


Corresponding author

Correspondence to Yiping Ke.


About this article

Cite this article

Wu, D., Ke, Y., Yu, J.X. et al. Leadership discovery when data correlatively evolve. World Wide Web 14, 1–25 (2011). https://doi.org/10.1007/s11280-010-0095-z


Received: 18 January 2010
Revised: 17 June 2010
Accepted: 21 June 2010
Published: 10 July 2010
Issue Date: January 2011
DOI: https://doi.org/10.1007/s11280-010-0095-z
