Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2556195.2556224acmconferencesArticle/Chapter ViewAbstractPublication PageswsdmConference Proceedingsconference-collections
research-article

Fast approximation of betweenness centrality through sampling

Published: 24 February 2014 Publication History

Abstract

Betweenness centrality is a fundamental measure in social network analysis, expressing the importance or influence of individual vertices in a network in terms of the fraction of shortest paths that pass through them. Exact computation in large networks is prohibitively expensive and fast approximation algorithms are required in these cases. We present two efficient randomized algorithms for betweenness estimation. The algorithms are based on random sampling of shortest paths and offer probabilistic guarantees on the quality of the approximation. The first algorithm estimates the betweenness of all vertices: all approximated values are within an additive factor ɛ from the real values, with probability at least 1-δ. The second algorithm focuses on the top-K vertices with highest betweenness and approximate their betweenness within a multiplicative factor ɛ, with probability at least $1-δ. This is the first algorithm that can compute such approximation for the top-K vertices. We use results from the VC-dimension theory to develop bounds to the sample size needed to achieve the desired approximations. By proving upper and lower bounds to the VC-dimension of a range set associated with the problem at hand, we obtain a sample size that is independent from the number of vertices in the network and only depends on a characteristic quantity that we call the vertex-diameter, that is the maximum number of vertices in a shortest path. In some cases, the sample size is completely independent from any property of the graph. The extensive experimental evaluation that we performed using real and artificial networks shows that our algorithms are significantly faster and much more scalable as the number of vertices in the network grows than previously presented algorithms with similar approximation guarantees.

References

[1]
I. Abraham et al. VC-dimension and shortest path algorithms. ICALP'11, 2011.
[2]
D. Aingworth et al. Fast estimation of diameter and shortest paths (without matrix multiplication). SIAM J. on Comput., 28 (4): 1167--1181, 1999.
[3]
J. M. Anthonisse. The rush in a directed graph. TR BN 9/71, Stichting Mathematisch Centrum, Amsterdam, Netherlands, 1971.
[4]
D. A. Bader et al. Approximating betweenness centrality. WAW'07, 2007.
[5]
A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286 (5439): 509--512, 1999.
[6]
K. Boitmanis et al. Fast and simple approximation of the diameter and radius of a graph. WEA'06, 2006.
[7]
S. P. Borgatti and M. G. Everett. A graph-theoretic perspective on centrality. Soc. Net., 28 (4): 466--484, 2006.
[8]
U. Brandes. A faster algorithm for betweenness centrality. J. Math. Sociol., 25 (2): 163--177, 2001.
[9]
U. Brandes. On variants of shortest-path betweenness centrality and their generic computation. Soc. Net., 30 (2): 136--145, 2008.
[10]
U. Brandes and C. Pich. Centrality estimation in large networks. Int'l. J. Bifurc. and Chaos, 17 (07):2303--2318, 2007.
[11]
G. Csárdi and T. Nepusz. The igraph software package for complex network research. InterJournal, Complex Sys.: 1695, 2006. URL http://igraph.sf.net.
[12]
S. Dolev et al. Routing betweenness centrality. J. ACM, 57 (4): 25:1--25:27, May 2010.
[13]
D. Eppstein and J. Wang. Fast approximation of centrality. J. Graph Alg. and Appl., 8 (1): 39--45, 2004.
[14]
L. C. Freeman. A set of measures of centrality based on betweenness. Sociometry, 40: 35--41, 1977.
[15]
R. Geisberger et al. Better approximation of betweenness centrality. ALENEX'08, 2008.
[16]
S. Har-Peled and M. Sharir. Relative p,ɛ-approximations in geometry. Discr. & Comput. Geom., 45 (3): 462--496, 2011.
[17]
W. Hoeffding. Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc., 58 (301): 13--30, 1963.
[18]
R. Jacob et al. Algorithms for centrality indices. LNCS, 3418: 62--82, 2005.
[19]
H. Kaindl and G. Kainz. Bidirectional heuristic search reconsidered. J. Artif. Intell. Res., 7: 283--317, 1997.
[20]
Kourtellis et al. Identifying high betweenness centrality nodes in large social networks. Soc. Net. Analysis and Mining, pages 1--16, 2012.
[21]
Kranakis et al. The VC-dimension of set systems defined by graphs. Discr. Appl. Math., 77 (3): 237--257, 1997.
[22]
Y. Li et al. Improved bounds on the sample complexity of learning. J. Comp. and Sys. Sci., 62 (3): 516--527, 2001.
[23]
Y.-s. Lim et al. Online estimating the k central nodes of a network. NSW'11, 2011
[24]
M. Löffler and J. M. Phillips. Shape fitting on point sets with probability distributions. ESA'09, 2009.
[25]
A. Maiya and T. Berger-Wolf. Online sampling of high centrality individuals in social networks. PAKDD'10, 2010.
[26]
M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[27]
M. Mohri et al. Foundations of Machine Learning. The MIT Press, 2012.
[28]
M. E. J. Newman. phNetworks - An Introduction. Oxford University Press, 2010.
[29]
J. Pfeffer and K. M. Carley. k-centralities: local approximations of global measures based on shortest paths. WWW'12, 2012.
[30]
I. Pohl. Bidirectional Heuristic Search in Path Problems. PhD thesis, Stanford University, 1969.
[31]
M. Riondato and E. M. Kornaropoulos Fast Approximation of Betweenness Centrality through Sampling. (extended version) http://cs.brown.edu/matteo/papers/RiondatoKornarop-BetweennessSampling.pdf, 2013.
[32]
L. Roditty and V. V. Williams. Approximating the diameter of a graph. CoRR abs/1207.3622, 2012.
[33]
A. E. Sariyüce et al. Shattering and compressing networks for betweenness centrality. SDM'13, 2013.
[34]
V. N. Vapnik and A. J. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. Theory of Prob. and its Appl., 16 (2): 264--280, 1971.

Cited By

View all
  • (2024)HertDroid: Android Malware Detection Method with Influential Node Filter and Heterogeneous Graph TransformerApplied Sciences10.3390/app1408315014:8(3150)Online publication date: 9-Apr-2024
  • (2024)TATKC: A Temporal Graph Neural Network for Fast Approximate Temporal Katz Centrality RankingProceedings of the ACM Web Conference 202410.1145/3589334.3645432(527-538)Online publication date: 13-May-2024
  • (2023)Betweenness centrality approximation in large networks using shortest paths approximation and adaptive samplingInternational Conference on Internet of Things and Machine Learning (IoTML 2023)10.1117/12.3013414(55)Online publication date: 29-Nov-2023
  • Show More Cited By

Index Terms

  1. Fast approximation of betweenness centrality through sampling

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    WSDM '14: Proceedings of the 7th ACM international conference on Web search and data mining
    February 2014
    712 pages
    ISBN:9781450323512
    DOI:10.1145/2556195
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 24 February 2014

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. betweenness centrality
    2. graph mining
    3. range set
    4. sampling
    5. social network analysis
    6. vc-dimension
    7. vertex diameter

    Qualifiers

    • Research-article

    Conference

    WSDM 2014

    Acceptance Rates

    WSDM '14 Paper Acceptance Rate 64 of 355 submissions, 18%;
    Overall Acceptance Rate 498 of 2,863 submissions, 17%

    Upcoming Conference

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)97
    • Downloads (Last 6 weeks)7
    Reflects downloads up to 03 Oct 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)HertDroid: Android Malware Detection Method with Influential Node Filter and Heterogeneous Graph TransformerApplied Sciences10.3390/app1408315014:8(3150)Online publication date: 9-Apr-2024
    • (2024)TATKC: A Temporal Graph Neural Network for Fast Approximate Temporal Katz Centrality RankingProceedings of the ACM Web Conference 202410.1145/3589334.3645432(527-538)Online publication date: 13-May-2024
    • (2023)Betweenness centrality approximation in large networks using shortest paths approximation and adaptive samplingInternational Conference on Internet of Things and Machine Learning (IoTML 2023)10.1117/12.3013414(55)Online publication date: 29-Nov-2023
    • (2023)Efficient Algorithm for Maximizing Betweenness Centrality in Large Networks2023 IEEE International Conference on High Performance Computing & Communications, Data Science & Systems, Smart City & Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys)10.1109/HPCC-DSS-SmartCity-DependSys60770.2023.00093(647-654)Online publication date: 17-Dec-2023
    • (2023)Estimation and update of betweenness centrality with progressive algorithm and shortest paths approximationScientific Reports10.1038/s41598-023-44392-013:1Online publication date: 10-Oct-2023
    • (2023)Algorithms for Large-Scale Network Analysis and the NetworKit ToolkitAlgorithms for Big Data10.1007/978-3-031-21534-6_1(3-20)Online publication date: 18-Jan-2023
    • (2022)Recent Advances in Fully Dynamic Graph Algorithms – A Quick Reference GuideACM Journal of Experimental Algorithmics10.1145/355580627(1-45)Online publication date: 13-Dec-2022
    • (2022)An algorithm for updating betweenness centrality scores of all vertices in a graph upon deletion of a single edgeJournal of Complex Networks10.1093/comnet/cnac03310:4Online publication date: 8-Aug-2022
    • (2022)Sampling-Based Data Mining Algorithms: Modern Techniques and Case StudiesMachine Learning and Knowledge Discovery in Databases10.1007/978-3-662-44845-8_48(516-519)Online publication date: 10-Mar-2022
    • (2021)Finding Route Hotspots in Large Labeled NetworksIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2019.295692433:6(2479-2492)Online publication date: 1-Jun-2021
    • Show More Cited By

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media