Nothing Special   »   [go: up one dir, main page]

Academia.eduAcademia.edu

Interpreting Graph-Based Sybil Detection Methods as Low-Pass Filtering

IEEE Transactions on Information Forensics and Security

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 18, 2023 1225 Interpreting Graph-Based Sybil Detection Methods as Low-Pass Filtering Satoshi Furutani , Toshiki Shibahara , Mitsuaki Akiyama , Member, IEEE, and Masaki Aida , Senior Member, IEEE Abstract— Online social networks (OSNs) are threatened by Sybil attacks, which create fake accounts (also called Sybils) on OSNs and use them for various malicious activities. Therefore, Sybil detection is a fundamental task for OSN security. Most existing Sybil detection methods are based on the graph structure of OSNs, and various methods have been proposed recently. However, although almost all methods have been compared experimentally in terms of detection performance and noise robustness, theoretical understanding of them is still lacking. In this study, we show that existing graph-based Sybil detection methods can be interpreted in a unified framework of low-pass filtering. This framework enables us to theoretically compare and analyze each method from two perspectives: filter kernel properties and the spectrum of shift matrices. Our analysis reveals that the detection performance of each method depends on the effectiveness of the low-pass filtering. Furthermore, on the basis of the analysis, we propose a novel Sybil detection method called SybilHeat. Numerical experiments on synthetic graphs and real social networks demonstrate that SybilHeat performs consistently well on graphs with various structural properties. This study lays a theoretical foundation for graph-based Sybil detection and leads to a better understanding of Sybil detection methods. Index Terms— Online social networks, Sybil detection, graph signal processing. I. I NTRODUCTION O NLINE Social Networks (OSNs) are essential platforms for people to interact with each other, communicate information, and spread social influence. According to the Pew Research Center’s report [1], about 70% of Americans were on Facebook in 2021, and seven in ten of them visited the site daily. However, OSNs are under threat from Sybil attacks, which create fake accounts (also called Sybils) on OSNs and use them for various malicious activities, such as distributing spam, phishing URLs, and malware, and manipulating public opinion and the stock market by spreading fake news. For example, Sybils have been exploited to propagate anti-vaccine Manuscript received 19 June 2022; revised 9 December 2022; accepted 9 January 2023. Date of publication 18 January 2023; date of current version 23 January 2023. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Issa Traore. (Corresponding author: Satoshi Furutani.) Satoshi Furutani is with NTT Social Informatics Laboratories, Tokyo 180-8585, Japan, and also with the Graduate School of Systems Design, Tokyo Metropolitan University, Tokyo 191-0065, Japan (e-mail: satoshi.furutani.ek@hco.ntt.co.jp). Toshiki Shibahara and Mitsuaki Akiyama are with NTT Social Informatics Laboratories, Tokyo 180-8585, Japan (e-mail: toshiki.shibahara.de@ hco.ntt.co.jp; akiyama@ieee.org). Masaki Aida is with the Graduate School of Systems Design, Tokyo Metropolitan University, Tokyo 191-0065, Japan (e-mail: aida@tmu.ac.jp). Digital Object Identifier 10.1109/TIFS.2023.3237364 messages [2], [3] and manipulate online political discussions [4], [5]. Therefore, Sybil detection is a fundamental task for OSN security. The common Sybil detection approach is the graph-based approach that detects Sybils on the basis of the graph structure of OSNs (i.e., the friendship relation between users on OSNs). This approach is motivated by the following observation: Sybils tend to be densely connected to other Sybils and sparsely connected to benign users because malicious attackers can easily control the connection between Sybils while they cannot control the connection between Sybils and benign users [6]. Therefore, it is expected that one can distinguish between a Sybil region and a benign region by exploiting the graph structure of OSNs. Most graph-based methods predict unknown node labels (Sybil or benign) by assigning a prior reputation score to each node using known node labels and then updating and propagating the reputation score locally on a graph. Two kinds of propagation algorithms are often used: random walkbased and loopy belief propagation-based. Random walkbased methods [7], [8], [9], [10] propagate the trust or badness score by random walks from known benign or Sybil nodes and rank the Sybil-likeness of unknown nodes. Loopy belief propagation methods [11], [12], [13], [14], [15] model the OSN structure as a pairwise Markov random field and compute the marginal distribution for each node (i.e., the probability that a node is Sybil) by a loopy belief propagation algorithm or its approximation. However, although various Sybil detection methods have been proposed over the past decade, almost all methods have been compared just experimentally in terms of detection performance and noise robustness. Since experimental results often depend on experimental conditions, such as dataset properties and experimental settings, a good result for an experimental condition does not guarantee the same for other ones. To understand why and under what conditions each method works well, we need to compare them theoretically. To this end, in our previous work [16], we formulated the random walk with restart and the loopy belief propagation algorithm as low-pass filtering and attempted a theoretical comparison of the performance of random walk-based and loopy belief propagation-based Sybil detection methods. However, this work does not provide a comprehensive comparison of existing detection methods (only a comparison between CIA [7] and SybilBelief [11]), nor can it explain the differences in detection performance for differences in structural properties of graphs (such as degree heterogeneity and modularity). This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ 1226 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 18, 2023 In this study, extending our previous work [16], we show that existing representative graph-based Sybil detection methods (CIA [7], SybilRank [8], SybilWalk [10], SybilBelief [11], and SybilSCAR [15]) can be interpreted in a unified framework of low-pass filtering. This framework enables us to theoretically compare and analyze each method from two perspectives: filter kernel properties and the spectrum of shift matrices. Our analysis reveals that the detection performance of each method depends on how good the low-pass filtering is. In other words, for a Sybil detection method to perform well, 1) the filter kernel must properly emphasize (remove) low (high) frequency components, and 2) the low frequency eigenvectors of the shift matrix must have high community detectability. Furthermore, on the basis of the analysis, we propose a novel detection method, called SybilHeat, with the filter kernel and the shift matrix that satisfies the above two requirements. Our main contribution are summarized as follows: • We present the low-pass filtering framework for theoretically comparing and analyzing Sybil detection methods and identify the requirements for high performance of a Sybil detection method. • We propose a Sybil detection method called SybilHeat that performs consistently better than other methods on graphs with various structural properties. • We demonstrate the validity of our analysis and the performance of the proposed detection method through numerical experiments on synthetic graphs generated by the stochastic block model (SBM) and real social networks. The rest of this paper is organized as follows: In Section II and Section III, we present related work and preliminaries, respectively. We interpret the existing methods as low-pass filtering in Section IV. In Section V, we provide a theoretical comparison of the existing methods based on the interpretation and discuss our proposed method, SybilHeat. In Section VI, we evaluate the validity of our analysis and the performance of SybilHeat through numerical experiments. Finally, Section VII concludes our paper. II. R ELATED W ORK In this section, we give a brief overview of Sybil detection methods. Random walk-based methods [7], [8], [9], [10], [17], [18], [19] detect Sybils by random walks from known labeled nodes on a graph. SybilGuard [17] and SybilLimit [18] detect Sybils using special random walks called random routes. In a normal random walk, the destination of a walker is randomly chosen for each step, whereas in random routes, the destination is predetermined by the permutation πv for each node v. That is, random routes that enter from an edge e always exit from edge πv (e). An unlabeled node is approved as a benign node when the random routes originating from it intersect with the random routes from a known benign node. SybilInfer [19] builds a probabilistic model of benign regions and uses it to detect potential Sybil regions. SybilGuard, SybilLimit, and SybilInfer are not scalable to large OSNs and are not robust to label noise because they only use information from known benign nodes. CIA [7] propagates the badness score of each node by random walks with restart from known Sybil nodes. SybilRank [8] evaluates the trust score of each node by computing the landing probability of early-terminated random walks from known benign nodes. Íntegro [9] improves SybilRank by learning the edge weights and then considering random walks on the weighted graph. The random walkbased method described above has the limitation that only labeled benign nodes or labeled Sybil nodes can be used (not both). To overcome this problem, SybilWalk [10] computes the badness score of each node by random walks on the augmented graph with two additional nodes (Sybil label node and benign label node). Loopy belief propagation methods [11], [12], [13], [14], [15] model the OSN structure as a pairwise Markov random field and compute the marginal distribution for each node (i.e., the probability that a node is Sybil) by a loopy belief propagation algorithm or its approximation. SybilBelief [11] first assigns the prior probability to each node using known node labels and then uses loopy belief propagation to calculate the posterior probability of them. Later studies [12], [13], [14] have demonstrated that learning and exploiting node and edge features improve the performance of Sybilbelief. SybilBelief and its variants rely on loopy belief propagation for inference, which is not scalable and has no convergence guarantees. Wang et al. [15] provided a general framework that integrates random walk-based and loopy belief propagationbased methods and proposed SybilSCAR, a random walklike score propagation algorithm, by approximating the loopy belief propagation algorithm. SybilSCAR is more scalable than SybilBelief, and convergence is guaranteed. However, this framework does not provide theoretical insight into the performance of existing Sybil detection methods. Other Sybil detection methods include behavior-based detection methods [20], [21], [22], [23], [24], [25], [26]. They often use machine learning to classify users into benign or Sybil on the basis of their social behavior. Most of them consist of two steps: 1) extracting behavior-based (and sometimes graph-based) features that contribute to Sybil detection (e.g., tweet content and timing, follower/followee information, node centrality, etc.), and then 2) constructing a detection model using the extracted features. A major limitation of behaviorbased methods is that attackers can easily imitate the behavior of benign users, thereby compromising the effectiveness of the method. III. P RELIMINARIES A. Graph Signal Processing In this subsection, we briefly introduce the basic concepts of graph signal processing [27], [28]. Let G = (V, E) be an unweighted undirected graph without self-loops and multiple edges, where V = {1, 2, . . . , N } is the node set and E ⊂ V × V is the edge set. A graph signal x : V → R is the realvalued function defined on the node set V and is represented as N -dimensional vector x = (x1 , x2 , . . . , x N ). A shift matrix S = [Si j ] ∈ R N ×N is a matrix such that the off-diagonal element Si j ̸= 0 iff (i, j) ∈ E. When the graph signal x is multiplied by the shift matrix S, each element of the shifted signal x̃ = Sx is a linear combination of the signal value FURUTANI et al.: INTERPRETING GRAPH-BASED SYBIL DETECTION METHODS AS LOW-PASS FILTERING of its adjacent nodes, that is, the original graph signal is shifted over the graph. In general, the adjacency matrix and Laplacian matrix are often used as the shift matrix [27], [28]. The adjacency matrix A = [Ai j ] ∈ R N ×N is a real symmetric matrix defined as Ai j = 1 if (i, j) ∈ E and Ai j = 0 otherwise. The Laplacian matrix is defined as L := D − A where P N D := diag(d1 , d2 , . . . , d N ) is the degree matrix and di := i=1 Ai j is node i’s degree. We define the diagonal matrix 3 := diag(λ1 , λ2 , . . . , λ N ) with eigenvalues λ1 ≤ λ2 ≤ · · · ≤ λ N of S and the matrix V = (v 1 , v 2 , . . . , v N ) with the eigenvector v µ corresponding to λµ . The graph Fourier transform (GFT) of x is defined as x̂ := V −1 x and inverse GFT is x := V x̂. The graph filtering (also called graph convolution) of input signal x in is defined as x out = V h(3)V −1 x in , (1) where h(3) := diag(h(λ1 ), h(λ2 ), . . . , h(λ N )) and h(λ) is a filter kernel function defined on the region [λ1 , λ N ]. As with filtering in the classical signal processing, the graph filtering operation is interpreted as transforming the graph signal into the frequency domain signal by GFT, multiplying filter h(λ), and then transforming back into the graph signal by the inverse GFT. This outputs a signal in which specific frequency components of the input signal are amplified or attenuated. B. Stochastic Block Model A typical structural feature of real-world social networks is the existence of community structure. Roughly speaking, the community is a group of nodes that are densely connected within a group and sparsely connected between groups. One of the most basic models for generating random graphs with communities is the SBM [29]. Denoting k communities by C1 , C2 , . . . , Ck , SBM assumes node i and node j are connected with the probability Cl(i),l( j) , Pr(Ai j = 1) = N (2) where the symmetric matrix C = [Cab ] ∈ Rk×k is the connectivity matrix and Cab /N is the probability of the edge being connected between nodes belonging to Ca and Cb . The map l : V → {1, 2, . . . , k} assigns a community to each node. As a special case, let us consider the SBM with two symmetric communities (|C1 | = |C2 | = N /2) and denote Cab = cin if a = b and Cab = cout if a ̸ = b. In this case, it was conjectured in [30] that two communities are detectable if and only if the inequality r cin + cout cin − cout > (3) 2 2 holds, and it was proved in [31] and [32]. The main limitation of the SBM is that all nodes within each community have the same average degree. Degree-Corrected SBM (DCSBM) [33] is a more realistic model, which takes into account the degree heterogeneity of nodes within a 1227 community. DCSBM connects node i and node j with the probability Pr(Ai j = 1) = θi θ j Cl(i),l( j) , N (4) where θi is the intrinsic connectivity of node i. For each node i, θi is randomly sampled from the distribution p(θ ) with E[θ] = 1 and E[θ 2 ] = 8. The intrinsic connectivity is proportional to the expected node degree (i.e., E[di ] ∝ θi ) and produces an arbitrary degree distribution. DCSBM includes SBM as a special case where ∀i. θi = 1. In [34], the detectable condition (3) for SBM is generalized to DCSBM as r cin + cout cin − cout > . (5) 2 28 IV. I NTERPRETATION OF S YBIL D ETECTION AS L OW-PASS F ILTERING In this section, we explain how to interpret existing graphbased Sybil detection methods as low-pass filtering. For an undirected graph G = (V, E), let Vs ⊂ V be the set of labeled Sybil nodes and Vb ⊂ V be the set of labeled benign nodes. Given a prior reputation score q = (q1 , q2 , . . . , q N )⊤ and a graph G, existing graph-based Sybil detection methods can be understood as methods that iteratively update the reputation (t) (t) (t) score p(t) = ( p1 , p2 , . . . , p N )⊤ at step t following a certain update rule   p(t) = f p(t−1) ; q, G (6) until convergence, and then predict a label of each node using the final score p = limt→∞ p(t) [35]. The prior reputation score q and the update rule f (·) differ from method to method. We here consider reformulating (6) as the low-pass filtering p = V h(3)V −1 q, (7) where 3 and V are the diagonal matrix of eigenvalues of a shift matrix S and the invertible matrix consisting of its eigenvectors, respectively. h(·) is the low-pass filter kernel. This formulation gives a low-pass filtering interpretation to the existing Sybil detection methods and enables a theoretical comparison between them. Hereafter, we describe the low-pass filtering interpretation of the following representative Sybil detection methods: CIA [7], SybilRank [8], SybilWalk [10], SybilBelief [11], and SybilSCAR [15]. Note that, for simplicity, we here consider unweighted undirected graphs, but our approach is easy to extend to weighted ones. A. CIA CIA [7] propagates the badness score of each node by random walks with restart from labeled Sybil nodes Vs . For a restart parameter 0 < α < 1, the update rule is given by p(t) = α A D−1 p(t−1) + (1 − α) p(0) . (0) (8) The initial score of node i is set to pi = 1 if i ∈ Vs and (0) pi = 0 if i ̸∈ Vs . Denoting q = p(0) , we have the fixed 1228 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 18, 2023 point p of (8) as p = (1 − α)(I − α A D−1 )−1 q = (1 − α)(I − α(I − Lrw ))−1 q = V r (1 − α)(I − α(I − 3r ))−1 V −1 r q, (9) where 3r and V r are matrices consisting of eigenvalues and eigenvectors of the random walk Laplacian Lrw := I − A D−1 , respectively. B. SybilRank SybilRank [8] evaluates the trust score of each node by computing the landing probability of early-terminated random walks from labeled benign nodes Vb . This is motivated by the hypothesis that since the connection between Sybil and benign nodes is sparse, a random walk starting from a benign node and terminating in a finite step is less likely to reach a Sybil node, and thus the landing probability is higher for benign nodes and lower for Sybil nodes. Setting the initial score to (0) (0) pi = 1/|Vb | if i ∈ Vb and pi = 0 if i ̸ ∈ Vb , the trust score (t) p is updated by p (t) = AD −1 (t−1) p . (10) SybilRank calculates the final trust score by terminating the above update equation at a finite step 0 = O(log N ) and then normalizing the trust score by the degree to eliminate the (0) degree bias (i.e., pi = pi /di ). Since p(0) = ( A D−1 )0 p(0) , we have the final trust score p = D−1 (I − Lrw )0 q = D−1 V r (I − 3r )0 V −1 r q, (11) p(0) . with q = Therefore, SybilRank can be interpreted as the operation combining the low-pass filtering by Lrw and degreenormalization. C. SybilWalk SybilWalk [10] computes the badness score of each node by b = (V ∪ {ls , lb }, E), b random walks on the augmented graph G obtained by adding two label nodes (Sybil label node ls and benign label node lb ) to an original graph G. For the b label nodes ls and lb are respectively augmented graph G, connected to known Sybil nodes and known benign nodes b = E ∪ {(i, lb ) | i ∈ Vb } ∪ {(i, ls ) | i ∈ Vs }). The badness (i.e., E score for each node i ∈ V is calculated as the probability that a random walk starting from node i will reach ls before reaching lb as follows: X ai j (t−1) (t) , (12) p pi = dbi j j∈b ∂i b where dbi is the degree of node i in the augmented graph G (i.e., dbi = di + 1 if i ∈ Vb ∪ Vs and dbi = di otherwise) and b b The badness scores ∂i is the set of neighbors of node i in G. of label nodes are given by plb = 0 and pls = 1. Indeed, SybilWalk is equivalent to an absorbing Markov chain with lb and ls as absorbing nodes. For a random walk on −1 b the transition matrix between user nodes is b G, D A ∈ R N ×N where b D := diag(db1 , db2 , . . . , dbN ), and the transition matrix from user nodes to label nodes is given by Q = (q b , q s ) ∈ R N ×2 . Here, each component of q s is defined as qsi = 1/dbi if i ∈ Vs and qsi = 0 if i ̸∈ Vs , and q b is defined in the same way. Hence, (12) is rewritten as p(t) = 5 p(t−1) by using the transition matrix ! −1 b D A Q , 5 := O I2 where I 2 is the 2 × 2 identity matrix. Therefore, we have   ! .. −1 O (I − b D A)−1 Q  .  p = lim 5t p(0) = 0 t→∞ O I2 1 −1 −1 = (I − b D A)−1 q s = V a 3−1 a V a qs , (13) where 3a and V a are matrices consisting of eigenvalues and eigenvectors of the augmented normalized Laplacian Laug := −1 I−b D A, respectively. D. SybilBelief SybilBelief [11] models the OSN structure as a pairwise Markov random field and computes the marginal distribution for each node (i.e., the probability that a node is Sybil) by the standard loopy belief propagation [36]. Let us associate a random variable si ∈ {−1, +1} to each node i ∈ V . si = +1 means that node i is Sybil, and si = −1 means that it is benign. The pairwise Markov random field is defined as p(s1 , s2 , . . . , s N ) = 1 Z Y (i, j)∈E ψi j (si , s j ) Y φi (si ), (14) i∈V where Z is a normalization constant (called partition function), and φi (si ) and ψi j (si , s j ) are node and edge potential functions defined as follows, respectively: ( qi if si = +1 φi (si ) = , (15) 1 − qi if si = −1 ( wi j if si s j = +1 ψi j (si , s j ) = (16) 1 − wi j if si s j = −1, where E⃗ is the set of oriented edges of E and satisfies ⃗ = 2|E|. We can determine whether a node i is Sybil or | E| not by evaluating the marginal distribution pi (si ). However, it is exponentially hard to compute the marginal distribution directly from the joint distribution in (14). The loopy belief propagation algorithm is a common method to calculate an approximate marginal distribution bi (si ) ≈ pi (si ). This algorithm iteratively updates the probability distribution (called ⃗ The message) µi j (s j ) for each directed edge (i, j) ∈ E. message from node i to node j at step t + 1 is given by (t+1) µi j (s j ) = Y (t) 1 X φi (si ) ψi j (si , s j ) µki (si ), (17) Zi j si =±1 k∈∂i\ j where ∂i\ j := ∂i \ { j} is the set of neighbors of node i excluding recipient node j. By using a converged message FURUTANI et al.: INTERPRETING GRAPH-BASED SYBIL DETECTION METHODS AS LOW-PASS FILTERING 1229 µi∞j , the approximate marginal distribution bi (si ) is computed as Y 1 (18) bi (si ) = φi (si ) µ∞ ki (si ). Zi k∈∂i If G is a tree (G has no loops), bi (si ) is exactly equal to pi (si ). If G has loops, bi (si ) is not equal to pi (si ) but often provides a good approximation [37]. Since the loopy belief propagation algorithm is nonlinear, we have to linearize (17) to represent it as (7). In our previous study [16], we linearize (17) around a fixed point and reformulate SybilBelief as low-pass filtering by using a Bethe-Hessian matrix. The Bethe-Hessian matrix is defined as H(r ) := (r 2 − 1)I + D − r A, (19) with the parameter r ∈ R. When r = 1, the Bethe-Hessian matrix becomes the Laplacian matrix L. Hereafter, unless otherwise noted, we set the P parameter P r of the Bethe-Hessian H(r ) is set to r = [( i di2 )/( i di ) − 1]1/2 as in [38]. This setting has the advantage that informative eigenvalues are negative while the bulk of uninformative eigenvalues is positive, making them easy to distinguish between them. The low-pass filtering interpretation of SybilBelief is given by p = V H g(3H )V −1 H q, TABLE I S UMMARIZATION OF L OW-PASS F ILTERING I NTERPRETATION OF SENTATIVE S YBIL D ETECTION M ETHODS R EPRE - (20) where 3H and V H are matrices consisting of eigenvalues and eigenvectors of H(r ), respectively. The function g(λ) is the ideal low-pass filter kernel; i.e., g(λ) = 1 if λ ≤ λ′ and g(λ) = 0 otherwise. For details on the derivation of (20), see Appendix. E. SybilSCAR SybilBelief has the limitation of low scalability and no convergence guarantee of (17). To overcome these limitations, SybilSCAR [15] computes the probability pi of each node i being Sybil by approximating (17) by replacing ∂i\ j with ∂i. SybilSCAR assigns the prior probability qi to each node as   0.5 + θ if i ∈ Vb qi = 0.5 − θ if i ∈ Vs ,   0.5 otherwise where θ ∈ (0, 0.5] indicates assigning high prior probabilities to labeled Sybil nodes. For a variable y, let us define a residual variable y̌ := y − 1/2. The update function of SybilSCAR is given by p̌(t) = 2 W̌ p̌(t−1) + q̌, Fig. 1. Schematic diagram of low-pass filtering interpretation of Sybil detection. (21) where W̌ = (w̌i j ) is a residual weight matrix with the element w̌i j = 0 if (i, j) ̸ ∈ E, and we set to p̌(0) = q̌. In [15], two SybilSCAR algorithms are proposed: SybilSCAR-C and SybilSCAR-D. For an edge (i, j) ∈ E, SybilSCAR-C has a constant residual weight (i.e., w̌i j = 1/(2dmax )), while SybilSCAR-D has a degree-normalized residual weight (i.e., w̌i j = 1/(2d j )). For a fixed point p̌ of (21), we have p̌ = 2 W̌ p̌ + q̌ = (I − 2 W̌ )−1 q̌, and thus SybilSCAR-C is rewritten as  −1 1 −1 p̌ = I − q̌ = V m 3−1 (22) A m V m q̌, dmax where 3m and V m are matrices consisting of eigenvalues and eigenvectors of the maximum degree-normalized Laplacian 1 Lmax := I − dmax A, respectively. Also, SybilSCAR-D is rewritten as  −1 −1 q̌ = V r 3−1 (23) p̌ = I − A D−1 r V r q̌. V. T HEORETICAL C OMPARISONS In Section IV, we described the low-pass filtering interpretation of some representative Sybil detection methods. As shown in Table I, the differences between these methods can be attributed to the differences in their corresponding shift matrix and filter kernel. Low-pass filtering is an operation consisting of the following three steps (see Fig. 1): (i) transforming a graph signal into a frequency signal by the GFT defined by the spectrum of a certain shift matrix, (ii) extracting (removing) low (high) frequency components by multiplying a low-pass filter to it, and then (iii) transforming it back to a (modified) graph signal by the inverse GFT. Thus, the output of lowpass filtering depends on the property of the low-pass filter kernel and the choice of shift matrix (i.e., what Fourier basis is used for the frequency transform). As is well known in the context of spectral clustering and graph signal processing, the eigenvectors corresponding to the small (low frequency) eigenvalues of the Laplacian contain rich information about the 1230 Fig. 2. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 18, 2023 Low-pass filter kernels of existing Sybil detection methods. global community structure of a graph, while the eigenvectors corresponding to large (high frequency) eigenvalues contain noisy information [27], [39]. Thus, the performance of the Sybil detection method is expected to depend on how well the low pass filtering can extract the low frequency components and remove the high frequency components of the input signal. In this section, we compare and analyze each method from two perspectives: filter kernel properties and the spectrum of the shift matrix. Furthermore, on the basis of the theoretical insights, we propose a novel Sybil detection method called SybilHeat. A. Filter Kernel Properties Figure 2 plots the four different filter kernels in Table I. First, the CIA filter kernel h(λ) = (1 − α)/(1 − α(1 − λ)) does not remove high frequency components sufficiently, so the output signal may be affected by noisy high frequency components. For this reason, CIA is expected to have poor detection performance and noise robustness. Next, the SybilRank filter kernel h(λ) = (1 − λ)0 removes frequencies in the middle range (0.5 ≤ λ ≤ 1.5) but passes high frequency components (λ > 1.5). Therefore, if the largest eigenvalue λ N takes a large value, SybilRank may be strongly affected by high frequency components. As described below, since the largest eigenvalue λrN of the random walk Laplacian tends to become larger for a sparse graph, SybilRank is expected to perform poorly on sparse graphs. The filter kernel h(λ) = 1/λ strongly emphasizes low frequency components, and thus the contribution of high frequency components is relatively small. Since h(λ) → ∞ for λ → 0, the contribution of the eigenvector v 1 corresponding to the smallest eigenvalue, which is uninformative in general, is dominant. Particularly, since the smallest eigenvalue of Lrw is 0, SybilSCAR-D may fail detection. However, if low frequency eigenvalues and high frequency eigenvalues are sufficiently separated, that is, the eigengap between informative and uninformative eigenvalues, |λk − λk+1 |, is sufficiently large, high detection performance and noise robustness are expected. The filter kernel corresponding to SybilBelief equally extracts the low frequency components and completely removes the high frequency components, by definition. Therefore, SybilBelief is expected to exhibit high detection performance and noise robustness as long as the low frequency eigenvectors are informative. B. Spectrum of Shift Matrices The output of low-pass filtering for a graph signal depends on how the frequencies (eigenvalues) are distributed and what Fourier basis is used for frequency transformation (i.e., the choice of shift matrix). We here discuss each method focusing on the spectrum (eigenvalues and eigenvectors) of the shift matrix. 1) Eigenvalues of Shift Matrices: First, we discuss detection methods in terms of eigenvalues of the shift matrix. Although frequencies are sampled evenly in classical signal processing, in graph signal processing, however, the frequency (eigenvalue) distribution is uneven and differs depending on the shift matrix. Since the quality of the low-pass filtering is determined by how well it can extract low frequency components, it is anticipated that the more clearly low frequency eigenvalues are isolated from the bulk of high frequency eigenvalues on the eigenvalue distribution, the better the low-pass filtering is. Figure 3 shows the spectral distribution of shift matrices for a dense and sparse modular graph generated by SBM. For dense modular graphs, k small (low frequency) eigenvalues are clearly isolated from the bulk of high frequency eigenvalues for each shift matrix. On the other hand, for sparse modular graphs, the bulk of uninformative eigenvalues is spread out, making them difficult to distinguish informative and uninformative eigenvalues. This suggests that the high frequency components are difficult to sufficiently remove by low-pass filtering for sparse graphs, and thus the detection performance of any detection method will be worse than that of the dense graph case. 2) Eigenvectors of Shift Matrices: Next, we discuss detection methods in terms of the eigenvectors of the shift matrix. Since Sybil detection is essentially a problem of identifying Sybil and benign regions of a graph, the more informative the low frequency eigenvectors of each shift matrix are about the community structure of the graph, the better the detection performance is likely to be. Therefore, to measure the informativeness of low frequency eigenvectors, we evaluate the community detectability of each shift matrix. Specifically, we estimate communities by spectral clustering algorithm (Algorithm 1) with each shift matrix of graphs with k = 2 communities generated by SBM and DCSBM and compare the Normalized Mutual Information (NMI) score of true and estimated communities. Figure 4 shows the community detectability of low frequency eigenvectors of each shift matrix for sparse modular graphs generated by SBM and DCSBM. The vertical dashed lines represent the detectability threshold on the right-hand side of equations (3) and (5), respectively. First, for SBM graphs, all shift matrices can detect communities in the detectable region, and in particular, the Bethe-Hessian shows the highest detection performance. For DCSBM graphs, the FURUTANI et al.: INTERPRETING GRAPH-BASED SYBIL DETECTION METHODS AS LOW-PASS FILTERING 1231 Fig. 3. Spectral distributions of shift matrices for a dense modular graph generated by SBM with N = 3000, k = 2, dave = 20, cout = 1 (upper panels) and a sparse modular graph generated by SBM with N = 3000, k = 2, dave = 5, cout = 1 (lower panels). In each panel, the mark (“ | ”) stemming from the baseline is the position of eigenvalues, and the red arrow points to the second smallest eigenvalue λ2 . Algorithm 1 Spectral clustering algorithm Input: Shift matrix S, the number k of communities 1: Compute k smallest eigenvectors of S and construct the N × k eigenvector matrix V k = (v 1 , v 2 , . . . , v k ) 2: Normalize the rows of V k 3: For i = 1, 2, . . . , N , let yi be the vector corresponding to the i-th row of V k N with k-means into k communi4: Cluster the points { yi }i=1 ties Output: Estimated communities Cˆ1 , Cˆ2 , . . . , Cˆk Fig. 4. Community detectability of low frequency eigenvectors of the random walk Laplacian, augmented normalized Laplacian, maximum degree-normalized Laplacian, Bethe-Hessian, and regularized Laplacian for sparse modular graphs generated by SBM (left) and DCSBM (right). The vertical dotted line shows the detectability threshold. Parameters are N = 1000, k = 2, dave = 5, and θi = 1 for SBM and θi ∼ [U (3, 7)]3 for DCSBM. Simulations are averaged over 100 runs. maximum degree-normalized Laplacian Lmax and the BetheHessian H(r ) can detect communities in the detectable region, while the random walk Laplacian Lrw and the augmented normalized Laplacian Laug cannot detect the community when (cin − cout )/2 is small (i.e., weakly modular) even in the detectable region. The same is true for k > 2 communities. This suggests that SybilBelief and SybilSCAR-C perform well on sparse, degree heterogeneous, strong modular graphs. To explain the reason for the high community detectability of Lmax and H(r ), we have to review the regularization of a matrix. In the previous studies [40], [41], a method called regularized spectral clustering (RSC) was proposed to improve spectral clustering by using the regularized Laplacian −1/2 −1/2 Lτ = I − D τ A Dτ or Lτ = I − D−1 τ A instead of the random walk Laplacian Lrw . Here, Dτ := D + τ I is a regularized degree matrix and τ ∈ R≥0 is a regularization parameter. For a graph generated by DCSBM, Qin et al. [41] proved that the upper bound of the clustering error of RSC is proportional to 1/(dmin + τ ) and 1/λ̄2k , where λ̄k is the k smallest eigenvalue of the expectation of Lτ , Lτ = E[Lτ ]. For small τ (i.e., insufficient regularization), 1/(dmin + τ ) becomes large, and conversely, for large τ (i.e., excessive regularization), 1/λ̄k becomes large. Therefore, this result suggests that the clustering error of RSC can be minimized by setting appropriate τ . Qin et al. [41] proposed τ = dave as a suitable choice. Indeed, as shown in Fig. 4, the community detectability of Lτ with τ = dave is comparable to that of the Bethe-Hessian, which has the best performance. Given the above, the results in Fig. 4 can be explained as follows. Defining the diagonal matrix b I as [b I ]ii = 1 if b i ∈ Vb ∪ Vs and [ I ]ii = 0 otherwise, the augmented normalized Laplacian is represented as Laug = I −( D+b I)−1 A and can be regarded as being insufficiently and unevenly regularized. The maximum degree-normalized Laplacian can be regarded as being excessively and unevenly regularized since it can be rewritten as Lmax = I − ( D + Ddiff )−1 A with [Ddiff ]ii := dmax − di . From (19), Bethe-Hessian has the following relationship with the regularized Laplacian: H(r )v = λv ⇔ (I − r ( D + (r 2 − λ − 1)I)−1 A)v = 0 r −1 ⇔ Lr 2 −λ−1 v = v. r Hence, the spectrum of the Bethe-Hessian and the regularized Laplacian are closely related. C. SybilHeat The above analysis reveals that the detection performance of each method depends on the effectiveness of the low-pass filtering. More specifically, for a Sybil detection method to perform well, 1) the low-pass filter kernel h(λ) must properly emphasize (remove) low (high) frequency components, and 2) low frequency eigenvectors of the shift matrix S must have high community detectability. On the basis of this result, we propose a novel Sybil detection method (SybilHeat) with the filter kernel h(λ) = e−sλ (s ≥ 0) and the shift matrix −1/2 −1/2 S = Lτ = I − Dτ A Dτ (τ = dave ) that satisfy the 1232 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 18, 2023 above two requirements. The filter kernel h(λ) = e−sλ is called the heat kernel, and the larger the scaling parameter s ≥ 0 is, the more strongly high frequency components are reduced. The eigenvectors of Lτ have high community detectability comparable to the Bethe-Hessian, as described above. For given prior reputation score q, SybilHeat calculates the posterior reputation score p as −s Lτ p = V τ e−s3τ V −1 q = h(Lτ ) q, τ =e (24) where 3τ and V τ are matrices consisting of eigenvalues and eigenvectors of Lτ , respectively. Although all eigenvalues and eigenvectors are needed to naively compute h(Lτ ), the time complexity of the eigenvalue decomposition is O(N 3 ), which is not scalable for large N . However, fortunately, the Chebyshev polynomial approximation [42] enables us to avoid computing the eigenvalue decomposition and approximately compute (24). Specifically, by using the (shifted) Chebyshev polynomials {T̃k (λ)}∞ k=0 defined on the range λ ∈ [0, 2], we approximate h(Lτ )q as K X c̃0 c̃k T̃k (Lτ )q, h(Lτ )q ≈ q + 2 TABLE II C OMPARISON OF THE T IME C OMPLEXITY OF E ACH M ETHOD (25) k=1 where K is the approximation order. The Chebyshev coefficient c̃k (k = 0, 1, . . . ) is calculated by Z 2 π c̃k = h(cos θ + 1) cos(kθ)dθ. (26) π 0 By definition of the Chebyshev polynomials, T̃0 (Lτ ) = I, T̃1 (Lτ ) = Lτ − I, and, for k ≥ 2, T̃k (Lτ ) satisfies T̃k (Lτ ) = 2(Lτ − I)T̃k−1 (Lτ ) − T̃k−2 (Lτ ). This indicates that a vector T̃k (Lτ )q in (25) can be recursively computed from T̃k−1 (Lτ )q and T̃k−2 (Lτ )q. Dominant in this computational cost is the matrix-vector multiplication of Lτ , and its time complexity is O(|E|) [42]. Thus, the overall time complexity of (25) is O(K |E|). Since social networks are often sparse (i.e., |E| ∝ N ), SybilHeat is applicable to large social networks. For comparison, we present the time complexity of each method in Table II, based on the complexity analysis of prior works [8], [10], [11], [15]. Note that t is the number of iterations. Considering that log N b is almost equal to |E|, we see is very small and that | E| that all methods have almost the same scalability (linear time complexity with respect to |E|) theoretically. However, as shown in Section VI-D, the empirical runtimes of each method are different. VI. E MPIRICAL E VALUATIONS In the previous section, on the basis of the low-pass filtering interpretation of existing Sybil detection methods, we explained the reasons for the superiority or inferiority of the performance and the requirements for high performance of each method. In addition, we proposed SybilHeat on the basis of our analysis. In this section, we validate our analysis and evaluate the detection performance and noise robustness of SybilHeat through experiments on synthetic graphs and realworld social networks. Furthremore, we evaluate scalability in terms of the empirical runtime on synthetic graphs. A. Experimental Setup We use synthetic graphs and some real-world social networks for the experiments. Synthetic graphs are generated by the SBM and DCSBM with average degree dave = 5 and k = 2 even-sized communities. We assume that one community is the benign region and the other is the Sybil region. We also assume that 10% of nodes randomly selected from the benign region are labeled benign nodes and 10% of nodes randomly selected from the Sybil region are labeled Sybil nodes. We also use real social networks with community structure that are benchmark datasets for community detection tasks for evaluating the detection performance and noise robustness: Zachary karate club [43] (34 nodes and 78 edges), dolphin social network [44] (62 nodes and 159 edges), American colledge football [45] (115 nodes and 613 edges), and political blogs [46] (1224 nodes and 33430 edges). For evaluation, we used the largest connected component of each graph, with half of the communities as benign regions and the rest as Sybil regions. The max(3, ⌊0.1N ⌋) nodes randomly selected from the benign region were labeled benign nodes, and the max(3, ⌊0.1N ⌋) nodes randomly selected from the Sybil region were labeled Sybil nodes. We compare SybilHeat with CIA, SybilRank, SybilWalk, SybilSCAR-C, and SybilBelief. SybilSCAR-D is excluded because its convergence is not stable. The experimental parameters for each method were set as: the restart parameter α = 0.85 for CIA, the number of iteration 0 = ⌊log N ⌋ for SybilRank, the residual prior probability θ = 0.5 for SybilSCAR, and the scaling parameter s = 8 for SybilHeat. B. Detection Performance Since the Sybil detection method provides a ranking for each node such that Sybil nodes are ranked higher than benign nodes [47], we adopt the Area Under the Receiver Operating Characteristic Curve (AUC) to evaluate detection performance. The AUC of a method is the probability that a (randomly selected) Sybil node ranks higher than a benign node, and the AUC is 1.0 if all Sybil nodes rank higher than benign nodes. If all nodes are ranked uniformly at random, the AUC is 0.5. Figure 5 shows the detection performance of each method with respect to the community strength |cin − cout |/2 on synthetic graphs generated by the SBM (left) and DCSBM (right). We observe the following results from this figure. First, the detection performance of all methods increases as FURUTANI et al.: INTERPRETING GRAPH-BASED SYBIL DETECTION METHODS AS LOW-PASS FILTERING 1233 TABLE III D ETECTION P ERFORMANCE ON R EAL S OCIAL N ETWORKS . N UMBERS IN B OLD I NDICATE THE B EST P ERFORMANCE AND U NDERLINED O NES A RE THE S ECOND B EST Fig. 5. Detection performance of CIA, SybilRank, SybilWalk, SybilSCAR, SybilBelief, and SybilHeat for sparse modular graphs generated by SBM (left) and DCSBM (right). The vertical dotted line shows the detectability threshold. Parameters are N = 1000, k = 2, dave = 5, and θi = 1 for SBM and θi ∼ [U (3, 7)]3 for DCSBM. Simulations are averaged over 100 runs. Fig. 6. Noise robustness of CIA, SybilRank, SybilWalk, SybilSCAR, SybilBelief, and SybilHeat for sparse modular graphs generated by SBM (left) and DCSBM (right). Parameters are N = 1000, k = 2, dave = 5, cout = 0.5, and θi = 1 for SBM and θi ∼ [U (3, 7)]3 for DCSBM. Simulations are averaged over 100 runs. the modularity (i.e., the difference between intra- and intercommunity connectivity) increases. Second, random walkbased methods (CIA, SybilRank, and SybilWalk) perform worse for DCSBM graphs than for SBM graphs. This is due to the low community detection performance of the shift matrices corresponding to these methods for DCSBM graphs, as shown in Fig. 4. Moreover, SybilBelief shows the best performance in the detectable region, while in the undetectable region, its detection performance drops sharply. In other words, SybilBelief performs well only on strongly modular graphs. This is due to the fact that SybilBelief relies on only the k small eigenvectors of the Bethe-Hessian (which have no information about the community structure in the undetectable region) to perform detection. On the other hand, SybilHeat performs consistently better over the two regions than the other methods. That is, SybilHeat performs better next to SybilBelief in the detectable region and performs best in the undetectable region, comparable to SybilWalk and SybilSCAR-C. against label noise because the corresponding filter can completely remove high frequency components, while CIA and SybilRank are not robust because of insufficient low-pass filtering. SybilHeat has higher noise robustness than other methods except SybilBelief. This is because the filter kernel h(λ) = e−sλ corresponding to SybilHeat greatly removes high frequency components. In particular, in the low noise range (ε ≤ 0.1), SybilHeat performs almost no worse than in the noiseless case (ε = 0.0). Although SybilWalk and SybilSCAR-C have the same filter kernel h(λ) = 1/λ, SybilSCAR-C has lower noise robustness than SybilWalk. This is because the contribution of high frequency components cannot be neglected since eigenvalues of the shift matrix corresponding to SybilSCAR-C on sparse graphs are aggregated around λ = 1 (i.e., low and high frequency eigenvalues of Lmax are close to each other), as shown in Figure 3. Table III shows the detection performance of each method for ε = 0.0, 0.1, and 0.2 on real-world social networks. As in the results on synthetic graphs, SybilBelief is robust to label noise on all datasets. SybilWalk and SybilHeat are next to SybilBelief in robustness. However, SybilHeat performs more consistently than SybilWalk because the performance of SybilWalk varies with the data, as shown in the results for polblogs. C. Robustness to Label Noise In practice, training datasets may contain noise due to human error [48]. That is, some labeled benign (Sybil) nodes are actually Sybil (benign). To evaluate the noise robustness of each method, we compare their detection performance when the node labels of training data with a fraction ε (≤ 0.5) are flipped. Figure 6 shows the detection performance of each method with respect to the fraction of label noise on synthetic graphs. First, as discussed in Section V-A, SybilBelief is quite robust D. Scalability We evaluate scalability in terms of the runtime of each method for varying the number of edges in synthetic graphs. Note that all methods are iterative algorithms, so their runtimes are highly dependent on the number of iterations t. To avoid 1234 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 18, 2023 node features and edge weights will improve detection performance. We hope that this study leads to a deeper theoretical understanding and further improvement of graph-based Sybil detection methods. A PPENDIX L INEARIZATION AND F ILTERING I NTERPRETATION OF L OOPY B ELIEF P ROPAGATION Fig. 7. Runtimes of CIA, SybilRank, SybilWalk, SybilSCAR, SybilBelief, and SybilHeat with respect to the number of edges. bias due to the number of iterations, we run these methods with the same number of iterations (t = 20). For SybilHeat, we set the approximate order K = 20. Figure 7 shows runtimes of SybilHeat and the compared methods with respect to the number of edges. All methods have linear time complexity with respect to the number of edges, which is consistent with the analysis in Section V-C. Furthermore, CIA, SybilRank, SybilWalk, and SybilSCAR are almost equally scalable, whereas SybilBelief requires more runtime than the other methods. Similar results are reported in the previous work [10], [15]. SybilHeat requires slightly more runtime than CIA, SybilRank, SybilWalk, and SybilSCAR, but is more scalable than SybilBelief. VII. C ONCLUSION We have shown that existing graph-based Sybil detection methods can be interpreted in a unified framework of low-pass filtering. According to this interpretation, the performance of a Sybil detection method depends on how good the low-pass filtering is. In other words, for a Sybil detection method to perform well, 1) the filter kernel h(λ) must properly emphasize (remove) low (high) frequency components, and 2) the low frequency eigenvectors of the shift matrix S must have high community detectability. Therefore, we have compared and analyzed existing detection methods from two perspectives (filter kernel properties and spectrum of the shift matrices) and have provided theoretical explanations of the superiority or inferiority of the performance and the conditions for high performance of each method. Furthermore, we proposed the Sybil detection method (called SybilHeat) with the heat kernel as the filter kernel and the regularized Laplacian as the shift matrix, which satisfies the above two requirements. SybilHeat is applicable to large social networks because it can be approximated by the Chebyshev polynomial approximation in the linear order with respect to the number of edges. Numerical experiments show that SybilHeat performs consistently better than other methods on graphs with various structural properties. Although we proposed a novel Sybil detection method using heat kernel and regularized Laplacian as the filter kernel and shift matrix, respectively, the performance might be improved by using other better filter kernels or shift matrices. Also, as stated in existing studies [9], [12], [13], [14], learning We first explain the linearization of loopy belief propagation by an approach presented in [49]. We respectively define the node potential function and edge potential function as φi (si ) = exp(βθi si ) and ψi j (si , s j ) = exp(β Ji j si s j ) where β is the inverse temperature, θi is the local magnetic field on node i, and Ji j is the interaction strength between node i and node j. Note that, the definitions of potential functions in (15) and (16) can be the above definitions such P recovered by normalizing P that si φi (si ) = 1 and si ,s j ψi j (si , s j ) = 1. Let us introduce the one-parametrized message νi j := tanh−1 (µi j (+1) − µi j (−1)) instead of the message µi j (s j ). For simplicity, denoting µi+j := µi j (+1) and µi−j := µi j (−1), we obtain tanh(νi j )    Y Y 1  β Ji j −βθi  e − e−β Ji j eβθi µ+ µ− = ki − e ki Zi j k∈∂i\ j k∈∂i\ j Y Y −βθi eβθi µ+ µ− ki − e ki  k∈∂i\ j k∈∂i\ j = tanh β Ji j . Y Y − −βθi eβθi µ+ + e µ ki ki k∈∂i\ j k∈∂i\ j Since log Y µ+ ki k∈∂i\ j µ− ki !1 2 = + − X 1 µ+ + µ− ki + µki − µki log ki − + − 2 µ+ ki + µki − µki + µki k∈∂ i\ j − X 1 1 + (µ+ ki − µki ) = log + 2 1 − (µki + µ− ki ) k∈∂i\ j X X − = tanh−1 (µ+ − µ ) = νki , ki ki k∈∂i\ j k∈∂i\ j by using νi j , we can rewrite (17) as     X  νki  . tanh νinew = tanh β Ji j tanh βθi + j (27) k∈∂i\ j The approximate marginal distribution bi (si ) can also be oneparametrized by its expectation (called magnetization) m i = ⟨si ⟩ = bi (1) − bi (−1) as follows: Y Y −βθi eβθi µ∞ µ∞ ki (1) − e ki (−1) mi = k∈∂i eβθi Y k∈∂i −βθi µ∞ ki (1) + e k∈∂i Y µ∞ ki (−1) k∈∂i ! = tanh βθi + X k∈∂i ∞ νki . (28) FURUTANI et al.: INTERPRETING GRAPH-BASED SYBIL DETECTION METHODS AS LOW-PASS FILTERING ⃗ Denoting ν = (νi j ) ∈ R| E| , let BP : ν 7 → ν new be the nonlinear operator that maps ν to ν new by following (27). The ⃗ ⃗ element of the Jacobian matrix B = BP ′ (ν) ∈ R| E|×| E| of BP is given by   2 tanh(β J ) 1 − tanh (h ) i j i\ j ∂νinew j = δil (1 − δ jk ), Bi j,kl = ∂νkl 1 − tanh2 (β Ji j ) tanh2 (h i\ j ) (29) P where h i\ j := βθi + k∈∂i\ j νki . The matrix B is called a non-backtracking matrix and its (i j, kl)-element takes a nonzero value if two directed edges are consecutive (i.e., i = l) but do not back track (i.e., j ̸ = k). To simplify the analysis, we assume the vanishing local field condition (i.e., θi = 0 for all i ∈ V ). This means that there is no prior information for each node. In this case, we have Bi j,kl = tanh(β Ji j )δil (1−δ jk ) and thus (27) can be written as ν new ≈ Bν by the linearization around the trivial fixed point ν ∗ = 0. Hence, when the spectral radius ρ(B) < 1, the message ν converges to the trivial fixed point ν ∗ . On the other hand, when ρ(B) > 1, ν leaves from ν ∗ . The eigenvector associated with an eigenvalue larger than 1 is expected to correspond approximately to non-trivial (hopefully informative) fixed points of the loopy belief propagation [50]. We consider unweighted non-backtracking matrix below (i.e., Bi j,kl = δil (1 − δ jk )). For small |νki |, the magnetization of each node is given by ! X X ∞ ∞ m i = tanh νki ≈ νki . (30) k∈∂i k∈∂i The eigenvector ν associated with an eigenvalue η > 1 satisfies X X ηνi j = Bi j,kl νkl = νki = m i − ν ji . (31) (k,l)∈ E⃗ k∈∂i\ j Similarly, ην ji = m j − νi j holds. Thus, we have νi j = (ηm i − m j )/(η2 − 1). By substituting this into (30), we obtain X (η2 − 1)m i + |∂i|m i − η m k = 0. (32) k∈∂i Alternatively, we rewrite the above equation as H(η)m = 0 by using the Bethe-Hessian matrix. The small eigenvalues of H(r ) are closely related to the informative eigenvalues of B, and the corresponding eigenvectors approximately give the magnetization m calculated by the loopy belief propagation [38], [50]. On the basis of this observation, a community detection algorithm using eigenvectors corresponding to small (low frequency) eigenvalues of the Bethe-Hessian matrix has been proposed and its performance has been demonstrated to be comparable to loopy belief propagation [38], [51], [52]. In the same spirit, we can interpret SybilBelief as low-pass filtering as in (20) using the ideal low-pass filter kernel g(ω) that extracts only the low frequency spectrum of H(r ). Note that we have assumed the vanishing local field condition through our analysis. This assumption is quite strong since it implies ignoring all known node labels. However, since the local magnetic field helps accelerate the convergence of the message and biases it toward the desired local minimum [53], 1235 a similar effect is expected by low-pass filtering of the known label vector q. R EFERENCES [1] B. Auxier and M. Anderson, “Social media use in 2021,” Pew Res. Center, vol. 1, pp. 1–4, Apr. 2021. [2] D. A. Broniatowski et al., “Weaponized health communication: Twitter bots and Russian trolls amplify the vaccine debate,” Amer. J. Public Health, vol. 108, no. 10, pp. 1378–1384, 2018. [3] J.-P. Allem and E. Ferrara, “Could social bots pose a threat to public health?” Amer. J. Public Health, vol. 108, no. 8, p. 1005, 2018. [4] A. Bessi and E. Ferrara, “Social bots distort the 2016 U.S. Presidential election online discussion,” 1st Monday, vol. 21, nos. 7–11, Nov. 2016. [5] M. T. Bastos and D. Mercea, “The Brexit botnet and user-generated hyperpartisan news,” Social Sci. Comput. Rev., vol. 37, no. 1, pp. 38–54, Feb. 2019. [6] L. Alvisi, A. Clement, A. Epasto, S. Lattanzi, and A. Panconesi, “SoK: The evolution of Sybil defense via social networks,” in Proc. IEEE Symp. Secur. Privacy, May 2013, pp. 382–396. [7] C. Yang, R. Harkreader, J. Zhang, S. Shin, and G. Gu, “Analyzing spammers’ social networks for fun and profit: A case study of cyber criminal ecosystem on Twitter,” in Proc. 21st Int. Conf. WWW, 2012, pp. 71–80. [8] Q. Cao, M. Sirivianos, X. Yang, and T. Pregueiro, “Aiding the detection of fake accounts in large scale social online services,” in Proc. 9th USENIX Symp. Networked Syst. Design Implement., 2012, pp. 197–210. [9] Y. Boshmaf et al., “Íntegro: Leveraging victim prediction for robust fake account detection in large scale OSNs,” Comput. Secur., vol. 61, pp. 142–168, Jan. 2016. [10] J. Jia, B. Wang, and N. Z. Gong, “Random walk based fake account detection in online social networks,” in Proc. 47th Annu. IEEE/IFIP Int. Conf. Dependable Syst. Netw. (DSN), Jun. 2017, pp. 273–284. [11] N. Z. Gong, M. Frank, and P. Mittal, “Sybilbelief: A semi-supervised learning approach for structure-based Sybil detection,” IEEE Trans. Inf. Forensics Security, vol. 9, no. 6, pp. 976–987, Jun. 2014. [12] H. Fu, X. Xie, Y. Rui, N. Z. Gong, G. Sun, and E. Chen, “Robust spammer detection in microblogs: Leveraging user carefulness,” ACM Trans. Intell. Syst. Technol., vol. 8, no. 6, pp. 1–31, 2017. [13] P. Gao, B. Wang, N. Z. Gong, S. R. Kulkarni, K. Thomas, and P. Mittal, “SYBILFUSE: Combining local attributes with global structure to perform robust Sybil detection,” in Proc. IEEE Conf. Commun. Netw. Secur. (CNS), May 2018, pp. 1–9. [14] A. Dorri, M. Abadi, and M. Dadfarnia, “SocialBotHunter: Botnet detection in Twitter-like social networking services using semi-supervised collective classification,” in Proc. IEEE 16th Int. Conf. Dependable, Autonomic Secure Comput., 16th Int. Conf. Pervasive Intell. Comput., 4th Int. Conf. Big Data Intell. Comput. Cyber Sci. Technol. Congr. (DASC/PiCom/DataCom/CyberSciTech), Aug. 2018, pp. 496–503. [15] B. Wang, J. Jia, L. Zhang, and N. Z. Gong, “Structure-based Sybil detection in social networks via local rule-based propagation,” IEEE Trans. Netw. Sci. Eng., vol. 6, no. 3, pp. 523–537, Jul. 2019. [16] S. Furutani, T. Shibahara, K. Hato, M. Akiyama, and M. Aida, “Sybil detection as graph filtering,” in Proc. IEEE Global Commun. Conf., Dec. 2020, pp. 1–6. [17] H. Yu, M. Kaminsky, P. B. Gibbons, and A. Flaxman, “SybilGuard: Defending against Sybil attacks via social networks,” in Proc. Conf. Appl., Technol., Archit., Protocols Comput. Commun., Aug. 2006, pp. 267–278. [18] H. Yu, P. B. Gibbons, M. Kaminsky, and F. Xiao, “SybilLimit: A nearoptimal social network defense against Sybil attacks,” in Proc. IEEE Symp. Secur. Privacy, May 2008, pp. 3–17. [19] G. Danezis and P. Mittal, “SybilInfer: Detecting Sybil nodes using social networks,” in Proc. 16th Annu. Netw. Dist. Syst. Secur. Symp. San Diego, CA, USA, 2009, pp. 1–15. [20] A. H. Wang, “Don’t follow me: Spam detection in Twitter,” in Proc. Int. Conf. Secur. Cryptogr., Jul. 2010, pp. 1–10. [21] K. Lee, J. Caverlee, and S. Webb, “Uncovering social spammers: Social honeypots + machine learning,” in Proc. 33rd Int. ACM SIGIR Conf. Res. Develop. Inf. Retr., Jul. 2010, pp. 435–442. [22] C. Yang, R. C. Harkreader, and G. Gu, “Die free or live hard? Empirical evaluation and new design for fighting evolving Twitter spammers,” in Proc. 14th Int. Workshop Recent Adv. Intrusion Detect. Cham, Switzerland: Springer, 2011, pp. 318–337. 1236 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 18, 2023 [23] A. A. Amleshwaram, N. Reddy, S. Yadav, G. Gu, and C. Yang, “CATS: Characterizing automation of Twitter spammers,” in Proc. 5th Int. Conf. Commun. Syst. Netw. (COMSNETS), Jan. 2013, pp. 1–10. [24] B. Wang, A. Zubiaga, M. Liakata, and R. Procter, “Making the most of tweet-inherent features for social spam detection on Twitter,” 2015, arXiv:1503.07405. [25] X. Zheng, Z. Zeng, Z. Chen, Y. Yu, and C. Rong, “Detecting spammers on social networks,” Neurocomputing, vol. 159, pp. 27–34, Jul. 2015. [26] G. Jethava and U. P. Rao, “User behavior-based and graph-based hybrid approach for detection of Sybil attack in online social networks,” Comput. Electr. Eng., vol. 99, Apr. 2022, Art. no. 107753. [27] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, “The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains,” IEEE Signal Process. Mag., vol. 30, no. 3, pp. 83–98, May 2012. [28] A. Ortega, P. Frossard, J. Kovačevič, J. M. F. Moura, and P. Vandergheynst, “Graph signal processing: Overview, challenges, and applications,” Proc. IEEE, vol. 106, no. 5, pp. 808–828, May 2018. [29] P. W. Holland, K. B. Laskey, and S. Leinhardt, “Stochastic blockmodels: First steps,” Social Netw., vol. 5, no. 2, pp. 109–137, 1983. [30] A. Decelle, F. Krzakala, C. Moore, and L. Zdeborová, “Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications,” Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top., vol. 84, Dec. 2011, Art. no. 066106. [31] E. Mossel, J. Neeman, and A. Sly, “Stochastic block models and reconstruction,” 2012, arXiv:1202.1499. [32] L. Massoulié, “Community detection thresholds and the weak Ramanujan property,” in Proc. 46th Annu. ACM Symp. Theory Comput., May 2014, pp. 694–703. [33] B. Karrer and M. E. J. Newman, “Stochastic blockmodels and community structure in networks,” Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top., vol. 83, Jan. 2011, Art. no. 016107. [34] L. Gulikers, M. Lelarge, and L. Massoulié, “An impossibility result for reconstruction in the degree-corrected stochastic block model,” Ann. Appl. Probab., vol. 28, no. 5, pp. 3002–3027, Oct. 2018. [35] B. Wang, J. Jia, and N. Zhenqiang Gong, “Graph-based security and privacy analytics via collective classification with joint weight learning and propagation,” 2018, arXiv:1812.01661. [36] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Burlington, MA, USA: Morgan Kaufmann, 1988. [37] K. Murphy, Y. Weiss, and M. I. Jordan, “Loopy belief propagation for approximate inference: An empirical study,” 2013, arXiv:1301.6725. [38] A. Saade, F. Krzakala, and L. Zdeborová, “Spectral clustering of graphs with the Bethe Hessian,” in Proc. Adv. Neural Inf. Process. Syst., vol. 27, 2014, pp. 1–15. [39] U. von Luxburg, “A tutorial on spectral clustering,” Statist. Comput., vol. 17, no. 4, pp. 395–416, Dec. 2007. [40] K. Chaudhuri, F. Chung, and A. Tsiatas, “Spectral clustering of graphs with general degrees in the extended planted partition model,” in Proc. 25th Annu. Conf. Learn. Theory, 2012, pp. 1–35. [41] T. Qin and K. Rohe, “Regularized spectral clustering under the degreecorrected stochastic blockmodel,” in Proc. Adv. Neural Inf. Process. Syst., vol. 26, 2013, pp. 1–13. [42] D. K. Hammond, P. Vandergheynst, and R. Gribonval, “Wavelets on graphs via spectral graph theory,” Appl. Comput. Harmon. Anal., vol. 30, no. 2, pp. 129–150, Mar. 2011. [43] W. W. Zachary, “An information flow model for conflict and fission in small groups,” J. Anthropol. Res., vol. 33, no. 4, pp. 452–473, Dec. 1977. [44] D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M. Dawson, “The bottlenose dolphin community of doubtful sound features a large proportion of long-lasting associations,” Behav. Ecol. Sociobiol., vol. 54, no. 4, pp. 396–405 2003. [45] M. Girvan and M. E. J. Newman, “Community structure in social and biological networks,” Proc. Nat. Acad. Sci. USA, vol. 99, no. 12, pp. 7821–7826, Apr. 2002. [46] L. A. Adamic and N. Glance, “The political blogosphere and the 2004 U.S. election: Divided they blog,” in Proc. 3rd Int. Workshop Link Discovery, 2005, pp. 36–43. [47] B. Viswanath, A. Post, K. P. Gummadi, and A. Mislove, “An analysis of social network-based Sybil defenses,” ACM SIGCOMM Comput. Commun. Rev., vol. 40, no. 4, pp. 363–374, 2010. [48] G. A. Wang et al., “Social turing tests: Crowdsourcing Sybil detection,” in Proc. 20th Annu. Netw. Dist. Syst. Secur. Symp., 2013, pp. 1–16. [49] J. M. Mooij and H. J. Kappen, “On the properties of the Bethe approximation and loopy belief propagation on binary networks,” J. Stat. Mech., Theory Exp., vol. 2005, no. 11, Nov. 2005, Art. no. P11012. [50] A. Saade, “Spectral inference methods on sparse graphs: Theory and applications,” 2016, arXiv:1610.04337. [51] L. Dall’Amico, R. Couillet, and N. Tremblay, “Revisiting the Bethe-Hessian: Improved community detection in sparse heterogeneous graphs,” in Proc. Adv. Neural Inf. Process. Syst., vol. 32, 2019, pp. 1–12. [52] L. Dall’Amico, R. Couillet, and N. Tremblay, “Optimal Laplacian regularization for sparse spectral community detection,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), May 2020, pp. 3237–3241. [53] C. Knoll and F. Pernkopf, “On loopy belief propagation—Local stability analysis for non-vanishing fields,” in Proc. 33rd Conf. Uncertainty Artif. Intell., 2017, pp. 1–10. Satoshi Furutani received the B.E. and M.E. degrees in system design engineering from Tokyo Metropolitan University, Japan, in 2016 and 2018, respectively, where he is currently pursuing the Ph.D. degree. Since joining Nippon Telegraph and Telephone Corporation (NTT) in 2018, he has been engaged in research and development on network science. He is currently a Researcher with NTT Social Informatics Laboratories. His research interests include analysis of social network dynamics and graph signal processing. He is a member of IPSJ and IEICE. Toshiki Shibahara received the B.E. degree in engineering and the M.E degree in information science and technology from The University of Tokyo, Tokyo, Japan, in 2012 and 2014, respectively, and the Ph.D. degree in information science and technology from Osaka University, Osaka, Japan, in 2020. Since joining Nippon Telegraph and Telephone Corporation (NTT) in 2014, he has been engaged in research on cyber security and machine learning. He is currently a Researcher with NTT Social Informatics Laboratories, Tokyo. Mitsuaki Akiyama (Member, IEEE) received the M.E. and Ph.D. degrees in information science from the Nara Institute of Science and Technology, Japan, in 2007 and 2013, respectively. Since joining Nippon Telegraph and Telephone Corporation (NTT) in 2007, he has been engaged in research and development on cybersecurity. He is currently a Senior Distinguished Researcher at NTT Social Informatics Laboratories. His research interests include cybersecurity measurement, offensive security, and usable security and privacy. He is a Senior Member of IPSJ and a member of IEICE. He received the Cybersecurity Encouragement Award of the Minister for Internal Affairs and Communications in 2020, the ISOC NDSS 2020 Distinguished Paper Award in 2020, and the IPSJ/IEEE Computer Society Young Computer Researcher Award in 2022. Masaki Aida (Senior Member, IEEE) received the B.S. degree in physics and the M.S. degree in atomic physics from Saint Paul’s University, Tokyo, Japan, in 1987 and 1989, respectively, and the Ph.D. degree in telecommunications engineering from The University of Tokyo, Japan, in 1999. In April 1989, he joined NTT Laboratories. From April 2005 to March 2007, he was an Associate Professor at the Faculty of Systems Design, Tokyo Metropolitan University. He has been a Professor of the Graduate School of Systems Design, Tokyo Metropolitan University, since April 2007. His current research interests include analysis of social network dynamics and distributed control of computer communication networks. He is a fellow of IEICE and a member of ACM and ORSJ. He received the Best Tutorial Paper Award and the Best Paper Award from the IEICE Communications Society in 2013 and 2016, respectively, and IEICE 100-Year Memorial Paper Award in 2017.