Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1148170.1148225acmconferencesArticle/Chapter ViewAbstractPublication PagesirConference Proceedingsconference-collections
Article

Generalizing PageRank: damping functions for link-based ranking algorithms

Published: 06 August 2006 Publication History

Abstract

This paper introduces a family of link-based ranking algorithms that propagate page importance through links. In these algorithms there is a damping function that decreases with distance, so a direct link implies more endorsement than a link through a long path. PageRank is the most widely known ranking function of this family.The main objective of this paper is to determine whether this family of ranking techniques has some interest per se, and how different choices for the damping function impact on rank quality and on convergence speed. Even though our results suggest that PageRank can be approximated with other simpler forms of rankings that may be computed more efficiently, our focus is of more speculative nature, in that it aims at separating the kernel of PageRank, that is, link-based importance propagation, from the way propagation decays over paths.We focus on three damping functions, having linear, exponential, and hyperbolic decay on the lengths of the paths. The exponential decay corresponds to PageRank, and the other functions are new. Our presentation includes algorithms, analysis, comparisons and experiments that study their behavior under different parameters in real Web graph data.Among other results, we show how to calculate a linear approximation that induces a page ordering that is almost identical to PageRank's using a fixed small number of iterations; comparisons were performed using Kendall's τ on large domain datasets.

References

[1]
R. Albert, H. Jeong, and A. L. Barabási. Diameter of the World Wide Web. Nature, 401:130--131, 1999.
[2]
A. Arasu, J. Cho, H. Garcia-Molina, A. Paepcke, and S. Raghavan. Searching the Web. ACM TOIT, 1(1):2--43, 2001.
[3]
R. Baeza-Yates and E. Davis. Web page ranking using link attributes. In Alt. track papers & posters, WWW Conf., pp. 328--329, New York, NY, USA, 2004.
[4]
R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval. Addison Wesley, 1999.
[5]
L. Becchetti, C. Castillo, D. Donato, S. Leonardi and R. Baeza-Yates. Using rank propagation and probabilistic counting for link-based spam detection. Tech. report DELIS-TR-0341, Dynamically-Evolving Large-Scale Inf. Sys. 2006.
[6]
K. Bharat and M. R. Henzinger. Improved algorithms for topic distillation in a hyperlinked environment. In Proc. of ACM SIGIR, pp. 104--111, Melbourne, Australia, 1998. ACM Press, New York.
[7]
P. Boldi. TotalRank: ranking without damping. In Poster Proc. of WWW Conf., pp. 898--899, Chiba, Japan, 2005. ACM Press.
[8]
P. Boldi, M. Santini, and S. Vigna. PageRank as a function of the damping factor. In Proc. of WWW Conf., pp. 557--566, Chiba, Japan, 2005. ACM Press.
[9]
B. Bollobás and O. Riordan. The diameter of a scale-free random graph. Combinatorica, 24(1):5--34, 2004.
[10]
M. Brinkmeier. PageRank revisited. ACM TOIT, 6(3):257--279, 2006.
[11]
Y.-Y. Chen, Q. Gan, and T. Suel. Local methods for estimating PageRank values. In Proc. of CIKM, pp. 381--389, New York, USA, 2004. ACM Press.
[12]
F. Chung and L. Lu. The diameter of random sparse graphs. Adv. Appl. Math., 26:257--279, 2001.
[13]
B. D. Davison. Topical locality in the Web. In Proc. of ACM SIGIR, pp. 272--279, Athens, Greece, 2000. ACM Press.
[14]
N. Eiron, K. S. Mccurley, and J. A. Tomlin. Ranking The web frontier. In Proc. of WWW Conf., pp. 309--318, New York, USA, 2004. ACM Press.
[15]
D. Fogaras. Where to start browsing the Web? In IICS, vol. 2877 of Springer LNCS, pp. 65--79, Leipzig, Germany, 2003.
[16]
A. Gulli and A. Signorini. The indexable Web is more than 11.5 billion pages. In Poster Proc. of WWW Conf., pp. 902--903, Chiba, Japan, 2005. ACM Press.
[17]
S. W. Haas and E. S. Grams. Page and link classifications: connecting diverse resources. In DL '98: Proc. of ACM Conf. on Digital libraries, pp. 99--107, New York, NY, USA, 1998. ACM Press.
[18]
T. Haveliwala and S. Kamvar. The condition number of the PageRank problem. Tech. Report 36, Stanford University, 2003.
[19]
T. Haveliwala and S. Kamvar. The second eigenvalue of the Google matrix. Tech. Report 20, Stanford University, 2003.
[20]
T. H. Haveliwala. Topic-sensitive PageRank. In Proc. of WWW Conf., pp. 517--526, Honolulu, Hawaii, USA, 2002. ACM Press.
[21]
W.-K. Joo and S. H. Myaeng. Improving retrieval effectiveness with hyperlink information. In Proc. of IRAL, Singapore, 1998.
[22]
L. Katz. A new status index derived from sociometric analysis. Psychometrika, 18:39--43, 1953.
[23]
J. M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5):604--632, 1999.
[24]
R. Kumar, P. Raghavan, S. Rajagopalan, D. Sivakumar, A. Tomkins, and E. Upfal. Stochastic models for the Web graph. In Proc. of FOCS, pp. 57--65, Redondo Beach, USA, 2000. IEEE CS Press.
[25]
Y. Li. Toward a qualitative search engine. IEEE Internet Comp., July 1998.
[26]
M. Lifantsev. Voting model for ranking Web pages. In Proc. of the Int. Conf. on Internet Computing, pp. 143--148, Las Vegas, Nevada, USA, 2000.
[27]
T.-Y. Liu and W.-Y. Ma. Webpage importance analysis using conditional Markov random walk. In Proc. of IEEE/WIC/ACM WI Conf., Compiegne, France, 2005. ACM Press.
[28]
M. Marchiori. The quest for correct information of the Web: hyper search engines. In Proc. WWW Conf., Santa Clara, USA, 1997.
[29]
F. Menczer. Lexical and semantic clustering by Web links. Journal of ASIST, 55(14):1261--1269, 2004.
[30]
M. E. Newman, S. H. Strogatz, and D. J. Watts. Random graphs with arbitrary degree distributions and their applications. Phys. Rev. E Stat. Nonlin. Soft Matter Phys., 64(2), 2001.
[31]
L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: bringing order to the Web. Tech. report, Stanford University, 1998.
[32]
G. Pandurangan, P. Raghavan, and E. Upfal. Using Pagerank to characterize Web structure. In Proc. of COCOON, vol. 2387 of LNCS, pp. 330--390, Singapore, 2002. Springer.
[33]
W. Rudin. Real and Complex Analysis. McGraw-Hill, May 1986.
[34]
P. Srinivasan, G. Pant, and F. Menczer. A general evaluation framework for topical crawlers. Information Retrieval, 8(3):417--447, 2005.
[35]
D. J. Watts and S. H. Strogatz. Collective dynamics of small-world networks. Nature, 393(6684):440--442, June 1998.
[36]
F. Wu, B. A. Huberman, L. A. Adamic, and J. R. Tyler. Information flow in social groups. Physica A: Statistical and Theoretical Physics, 337(1-2):327--335, June 2004.

Cited By

View all
  • (2024)DAppFL: Just-in-Time Fault Localization for Decentralized Applications in Web3Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3652116(137-148)Online publication date: 11-Sep-2024
  • (2024)Breaking the Entanglement of Homophily and Heterophily in Semi-supervised Node Classification2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00188(2379-2392)Online publication date: 13-May-2024
  • (2024)Deep asymmetric nonnegative matrix factorization for graph clusteringPattern Recognition10.1016/j.patcog.2023.110179148(110179)Online publication date: Apr-2024
  • Show More Cited By

Index Terms

  1. Generalizing PageRank: damping functions for link-based ranking algorithms

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
    August 2006
    768 pages
    ISBN:1595933697
    DOI:10.1145/1148170
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 06 August 2006

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. link analysis
    2. link-based ranking
    3. web graphs

    Qualifiers

    • Article

    Conference

    SIGIR06
    Sponsor:
    SIGIR06: The 29th Annual International SIGIR Conference
    August 6 - 11, 2006
    Washington, Seattle, USA

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)19
    • Downloads (Last 6 weeks)4
    Reflects downloads up to 12 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)DAppFL: Just-in-Time Fault Localization for Decentralized Applications in Web3Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3652116(137-148)Online publication date: 11-Sep-2024
    • (2024)Breaking the Entanglement of Homophily and Heterophily in Semi-supervised Node Classification2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00188(2379-2392)Online publication date: 13-May-2024
    • (2024)Deep asymmetric nonnegative matrix factorization for graph clusteringPattern Recognition10.1016/j.patcog.2023.110179148(110179)Online publication date: Apr-2024
    • (2023)A Comparative Analysis of Retrievability and PageRank MeasuresProceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation10.1145/3632754.3632760(67-72)Online publication date: 15-Dec-2023
    • (2023)Modulating the dynamics of NFκB and PI3K enhances the ensemble-level TNFR1 signaling mediated apoptotic responsenpj Systems Biology and Applications10.1038/s41540-023-00318-09:1Online publication date: 16-Nov-2023
    • (2023)Axiomatic characterization of PageRankArtificial Intelligence10.1016/j.artint.2023.103900318(103900)Online publication date: May-2023
    • (2023)On new PageRank computation methods using quantum computingQuantum Information Processing10.1007/s11128-023-03856-y22:3Online publication date: 8-Mar-2023
    • (2017)Unsupervised Ranking using Graph Structures and Node AttributesProceedings of the Tenth ACM International Conference on Web Search and Data Mining10.1145/3018661.3018668(771-779)Online publication date: 2-Feb-2017
    • (2017)Incremental maintenance of C-Rank scores in dynamic web environment2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC)10.1109/SMC.2017.8122838(1570-1574)Online publication date: Oct-2017
    • (2017)AptRank: an adaptive PageRank model for protein function prediction on   bi-relational graphsBioinformatics10.1093/bioinformatics/btx02933:12(1829-1836)Online publication date: 14-Feb-2017
    • Show More Cited By

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media