Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Finding the Source in Networks: An Approach Based on Structural Entropy

Published: 27 March 2023 Publication History

Abstract

The popularity of intelligent devices provides straightforward access to the Internet and online social networks. However, the quick and easy data updates from networks also benefit the risk spreading, such as rumor, malware, or computer viruses. To this end, this article studies the problem of source detection, which is to infer the source node out of an aftermath of a cascade, that is, the observed infected graph GN of the network at some time. Prior arts have adopted various statistical quantities such as degree, distance, or infection size to reflect the structural centrality of the source. In this article, we propose a new metric that we call the infected tree entropy (ITE), to utilize richer underlying structural features for source detection. Our idea of ITE is inspired by the conception of structural entropy [21], which demonstrated that the minimization of average bits to encode the network structures with different partitions is the principle for detecting the natural or true structures in real-world networks. Accordingly, our proposed ITE based estimator for the source tries to minimize the coding of network partitions brought by the infected tree rooted at all the potential sources, thus minimizing the structural deviation between the cascades from the potential sources and the actual infection process included in GN. On polynomially growing geometric trees, with increasing tree heterogeneity, the ITE estimator remarkably yields more reliable detection under only moderate infection sizes, and returns an asymptotically complete detection. In contrast, for regular expanding trees, we still observe guaranteed detection probability of ITE estimator even with an infinite infection size, thanks to the degree regularity property. We also algorithmically realize the ITE based detection that enjoys linear time complexity via a message-passing scheme, and further extend it to general graphs. Extensive experiments on synthetic and real datasets confirm the superiority of ITE to the baselines. For example, ITE returns an accuracy of 85%, ranking the source among the top 10%, far exceeding 55% of the classic algorithm on scale-free networks.

References

[1]
Ameya Agaskar and Yue M. Lu. 2013. A fast Monte Carlo algorithm for source localization on graphs. In Proceeding of the Wavelets and Sparsity XV, Vol. 8858, 429–434.
[2]
Kartik Anand and Ginestra Bianconi. 2009. Entropy measures for networks: Toward an information theory of complex topologies. Physical Review E 80, 4 (2009), 045102.
[3]
Albert-László Barabási and Réka Albert. 1999. Emergence of scaling in random networks. Science 286, 5439 (1999), 509–512.
[4]
Ginestra Bianconi. 2009. Entropy of network ensembles. Physical Review E 79, 3 (2009), 036114.
[5]
D. Bonchev and N. Trinajstić. 1977. Information theory, distance matrix, and molecular branching. The Journal of Chemical Physics 67, 10 (1977), 4517–4533.
[6]
Samuel L. Braunstein, Sibasish Ghosh, and Simone Severini. 2006. The Laplacian of a graph as a density matrix: A basic combinatorial approach to separability of mixed states. Annals of Combinatorics 10, 3 (2006), 291–317.
[7]
Yun Chai, Youguo Wang, and Liang Zhu. 2021. Information sources estimation in time-varying networks. IEEE Transactions on Information Forensics and Security 16 (2021), 2621–2636.
[8]
Jaeyoung Choi, Sangwoo Moon, Jiin Woo, Kyunghwan Son, Jinwoo Shin, and Yung Yi. 2017. Rumor source detection under querying with untruthful answers. In Proceedings of the IEEE INFOCOM. 1–9.
[9]
Jaeyoung Choi and Yung Yi. 2018. Necessary and sufficient budgets in information source finding with querying: Adaptivity gap. In Proceedings of the IEEE ISIT. 2261–2265.
[10]
Yongwook Choi and Wojciech Szpankowski. 2009. Compression of graphical structures. In Proceedings of the IEEE ISIT. 364–368.
[11]
Cesar Henrique Comin and Luciano da Fontoura Costa. 2011. Identifying the starting point of a spreading process in complex networks. Physical Review E 84, 5 (2011), 056105.
[12]
Matthias Dehmer. 2008. Information processing in complex networks: Graph entropy and information functionals. Applied Mathematics and Computation 201, 1-2 (2008), 82–94.
[13]
Wenxiang Dong, Wenyi Zhang, and Chee Wei Tan. 2013. Rooting out the rumor culprit from suspects. In Proceedings of the IEEE ISIT. 2671–2675.
[14]
Vincenzo Fioriti and Marta Chinnici. 2012. Predicting the sources of an outbreak with a spectral technique. arXiv:1211.2333.
[15]
Jacob Goldenberg, Barak Libai, and Eitan Muller. 2001. Talk of the network: A complex systems look at the underlying process of word-of-mouth. Marketing Letters 12, 3 (2001), 211–223.
[16]
Jiaojiao Jiang, Sheng Wen, Shui Yu, Yang Xiang, and Wanlei Zhou. 2015. K-center: An approach on the multi-source identification of information diffusion. IEEE Transactions on Information Forensics and Security 10, 12 (2015), 2616–2626.
[17]
Nikhil Karamchandani and Massimo Franceschetti. 2013. Rumor source detection under probabilistic sampling. In Proceedings of the IEEE ISIT. 2184–2188.
[18]
Jérôme Kunegis. 2013. Konect: The koblenz network collection. In Proceedings of the WWW. 1343–1350.
[19]
Theodoros Lappas, Evimaria Terzi, Dimitrios Gunopulos, and Heikki Mannila. 2010. Finding effectors in social networks. In Proceedings of the ACM SIGKDD. 1059–1068.
[20]
Jure Leskovec and Andrej Krevl. 2014. SNAP Datasets: Stanford Large Network Dataset Collection. (June 2014). Retrieved May 10, 2021 from http://snap.stanford.edu/data.
[21]
Angsheng Li and Yicheng Pan. 2016. Structural information and dynamical complexity of networks. IEEE Transactions on Information Theory 62, 6 (2016), 3290–3339.
[22]
Yiwei Liu, Jiamou Liu, Zijian Zhang, Liehuang Zhu, and Angsheng Li. 2019. REM: From structural entropy to community structure deception. In Advances in Neural Information Processing Systems. 12938–12948.
[23]
Andrey Y. Lokhov, Marc Mézard, Hiroki Ohta, and Lenka Zdeborová. 2014. Inferring the origin of an epidemic with a dynamic message-passing algorithm. Physical Review E 90, 1 (2014), 012801.
[24]
Wuqiong Luo and Wee Peng Tay. 2013. Estimating infection sources in a network with incomplete observations. In Proceedings of the IEEE GlobalSIP. 301–304.
[25]
Wuqiong Luo and Wee Peng Tay. 2013. Finding an infection source under the SIS model. In Proceedings of the IEEE ICASSP. 2930–2934.
[26]
Wuqiong Luo, Wee Peng Tay, and Mei Leng. 2013. Identifying infection sources and regions in large networks. IEEE Transactions on Signal Processing 61, 11 (2013), 2850–2865.
[27]
Pedro C. Pinto, Patrick Thiran, and Martin Vetterli. 2012. Locating the source of diffusion in large-scale networks. Physical Review Letters 109, 6 (2012), 068702.
[28]
B. Aditya Prakash, Jilles Vreeken, and Christos Faloutsos. 2012. Spotting culprits in epidemics: How many and which ones?. In Proceedings of the ICDM. 11–20.
[29]
Nicolas Rashevsky. 1955. Life, information theory, and topology. Bulletin of Mathematical Biophysics 17, 3 (1955), 229–235.
[30]
C. Raychaudhury, S. K. Ray, J. J. Ghosh, A. B. Roy, and S. C. Basak. 1984. Discrimination of isomeric structures using information theoretic topological indices. Journal of Computational Chemistry 5, 6 (1984), 581–588.
[31]
Martin Rosvall and Carl T. Bergstrom. 2008. Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences of the United States of America 105, 4 (2008), 1118–1123.
[32]
Devavrat Shah and Tauhid Zaman. 2011. Rumors in a network: Who’s the culprit? IEEE Transactions on Information Theory 57, 8 (2011), 5163–5181.
[33]
Devavrat Shah and Tauhid Zaman. 2016. Finding rumor sources on random trees. Operations Research 64, 3 (2016), 736–755.
[34]
Claude E. Shannon. 1948. A mathematical theory of communication. The Bell System Technical Journal 27, 3 (1948), 379–423.
[35]
Wenchang Tang, Feng Ji, and Wee Peng Tay. 2018. Estimating infection sources in networks using partial timestamps. IEEE Transactions on Information Forensics and Security 13, 12 (2018), 3035–3049.
[36]
Zhaoxu Wang, Wenxiang Dong, Wenyi Zhang, and Chee Wei Tan. 2014. Rumor source detection with multiple observations: Fundamental limits and algorithms. In Proceedings of the ACM SIGMETRICS.
[37]
Duncan J. Watts and Steven H. Strogatz. 1998. Collective dynamics of “small-world” networks. Nature 393, 6684 (1998), 440–442.
[38]
Pei-Duo Yu, Chee Wei Tan, and Hung-Lin Fu. 2018. Rumor source detection in finite graphs with boundary effects by message-passing algorithms. In Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. 175–192.
[39]
Kai Zhu, Zhen Chen, and Lei Ying. 2016. Locating the contagion source in networks with partial timestamps. Data Mining and Knowledge Discovery 30, 5 (2016), 1217–1248.
[40]
Kai Zhu and Lei Ying. 2014. Information source detection in the SIR model: A sample-path-based approach. IEEE/ACM Transactions on Networking 24, 1 (2014), 408–421.
[41]
Kai Zhu and Lei Ying. 2015. Source localization in networks: Trees and beyond. arXiv:1510.01814.

Cited By

View all
  • (2023)Evidence-Aware Fake News Detection: A Review2023 International Conference on Advanced Computing & Communication Technologies (ICACCTech)10.1109/ICACCTech61146.2023.00022(81-86)Online publication date: 23-Dec-2023

Index Terms

  1. Finding the Source in Networks: An Approach Based on Structural Entropy

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image ACM Transactions on Internet Technology
      ACM Transactions on Internet Technology  Volume 23, Issue 1
      February 2023
      564 pages
      ISSN:1533-5399
      EISSN:1557-6051
      DOI:10.1145/3584863
      • Editor:
      • Ling Liu
      Issue’s Table of Contents

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 27 March 2023
      Online AM: 06 February 2023
      Accepted: 09 October 2022
      Revised: 16 August 2022
      Received: 12 September 2021
      Published in TOIT Volume 23, Issue 1

      Permissions

      Request permissions for this article.

      Check for updates

      Author Tags

      1. Source detection
      2. structural entropy
      3. graph theory
      4. inference algorithms

      Qualifiers

      • Research-article

      Funding Sources

      • NSF China
      • 100-Talents Program of Xinhua News Agency, and the Program of Shanghai Academic/Technology Research Leader

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)165
      • Downloads (Last 6 weeks)19
      Reflects downloads up to 23 Nov 2024

      Other Metrics

      Citations

      Cited By

      View all
      • (2023)Evidence-Aware Fake News Detection: A Review2023 International Conference on Advanced Computing & Communication Technologies (ICACCTech)10.1109/ICACCTech61146.2023.00022(81-86)Online publication date: 23-Dec-2023

      View Options

      Login options

      Full Access

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Full Text

      View this article in Full Text.

      Full Text

      HTML Format

      View this article in HTML Format.

      HTML Format

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media