Abstract
PageRank has been widely used to measure the importance of web pages based on their interconnections in the web graph. Mathematically speaking, PageRank can be explained using a Markov random walk model, in which only the direct outlinks of a page contribute to its transition probability. In this paper, we propose improving the PageRank algorithm by looking N-step ahead when constructing the transition probability matrix. The motivation comes from the similar “looking N-step ahead” strategy that is successfully used in computer chess. Specifically, we assume that if the random surfer knows the N-step outlinks of each web page, he/she can make a better decision on choosing which page to navigate for the next time. It is clear that the classical PageRank algorithm is a special case of our proposed N-step PageRank method. Experimental results on the dataset of TREC Web track show that our proposed algorithm can boost the search accuracy of classical PageRank by more than 15% in terms of mean average precision.
Preview
Unable to display preview. Download preview PDF.
We’re sorry, something doesn't seem to be working properly.
Please try refreshing the page. If that doesn't work, please contact support so we can address the problem.
References
Craswell, N., Hawking, D.: Overview of the TREC 2003 Web Track. In: The twelfth Text Retrieval Conference (TREC 2003) (2003)
Hsu, F.-h.: Behind Deep Blue. Princeton University Press, Princeton (2002)
Kallenberg, O.: Foundations of Modern Probability, p.152
Kleinberg, J.: Authoritative sources in a hyperlinked environment. Journal of the ACM 46(5), 604–622 (1999)
Ng, A.Y., Zheng, A.X., Jordan, M.I.: Link analysis, eigenvectors, and stability. In: Proc. 17th International Joint Conference on Artificial Intelligence (2001)
Page, L., et al.: The PageRank citation ranking: Bringing order to the web. Technical report, Stanford University, Stanford, CA (1998)
Robertson, S.E.: Overview of the okapi projects. Journal of Documentation 53(1), 3–7 (1997)
Salton, G., McGill, M.J.: Introduction to Modern Information Retrieval. McGraw-Hill, New York (1983)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Zhang, L., Qin, T., Liu, TY., Bao, Y., Li, H. (2007). N-Step PageRank for Web Search. In: Amati, G., Carpineto, C., Romano, G. (eds) Advances in Information Retrieval. ECIR 2007. Lecture Notes in Computer Science, vol 4425. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71496-5_63
Download citation
DOI: https://doi.org/10.1007/978-3-540-71496-5_63
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71494-1
Online ISBN: 978-3-540-71496-5
eBook Packages: Computer ScienceComputer Science (R0)