Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Bootstrapped Graph Diffusions: Exposing the Power of Nonlinearity

Published: 12 June 2018 Publication History

Abstract

Graph-based semi-supervised learning (SSL) algorithms predict labels for all nodes based on provided labels of a small set of seed nodes. Classic methods capture the graph structure through some underlying diffusion process that propagates through the graph edges. Spectral diffusion, which includes personalized page rank and label propagation, propagates through random walks. Social diffusion propagates through shortest paths. These diffusions are linear in the sense of not distinguishing between contributions of few "strong" relations or many "weak'' relations. Recent methods such as node embeddings and graph convolutional networks (GCN) attained significant gains in quality for SSL tasks. These methods vary on how the graph structure, seed label information, and other features are used, but do share a common thread of nonlinearity that suppresses weak relations and re-enforces stronger ones. Aiming for quality gain with more scalable methods, we revisit classic linear diffusion methods and place them in a self-training framework. The resulting bootstrapped diffusions are nonlinear in that they re-enforce stronger relations, as with the more complex methods. Surprisingly, we observe that SSL with bootstrapped diffusions not only significantly improves over the respective non-bootstrapped baselines but also outperform state-of-the-art SSL methods. Moreover, since the self-training wrapper retains the scalability of the base method, we obtain both higher quality and better scalability.

References

[1]
S. Abney. 2004. Understanding the Yarowsky Algorithm. Comput. Linguist. Vol. 30, 3 (Sept. 2004).
[2]
J. Atwood and D. Towsley. 2016. Diffusion-Convolutional Neural Networks. In NIPS.
[3]
A. Bavelas. 1948. A mathematical model for small group structures. Human Organization Vol. 7 (1948), 16--30.
[4]
F. Bloch and M. O. Jackson. 2007. The formation of networks with transfers among players. Journal of Economic Theory Vol. 133, 1 (2007), 83--110.
[5]
A. Blum and S. Chawla. 2001. Learning from Labeled and Unlabeled Data Using Graph Mincuts ICML.
[6]
A. Blum, J. Lafferty, M. R. Rwebangira, and R. Reddy. 2004. Semi-supervised Learning Using Randomized Mincuts. In ICML.
[7]
E. Buchnik and E. Cohen. 2018. Bootstrapped Graph Diffusions: Exposing the Power of Nonlinearity. Proc. ACM Meas. Anal. Comput. Syst. Vol. 2, 1 (April. 2018).
[8]
O. Chapelle, B. Schölkopf, and A. Zien. 2006. Semi-supervised learning. MIT Press.
[9]
F. R. K. Chung. 1997. Spectral Graph Theory. American Mathematical Society.
[10]
E. Cohen. 1997. Size-Estimation Framework with Applications to Transitive Closure and Reachability. J. Comput. System Sci. Vol. 55 (1997), 441--453.
[11]
E. Cohen. 2016. Semi-Supervised Learning on Graphs through Reach and Distance Diffusion. CoRR Vol. abs/1603.09064 (2016). http://arxiv.org/abs/1603.09064
[12]
E. Cohen, D. Delling, F. Fuchs, A. Goldberg, M. Goldszmidt, and R. Werneck. 2013. Scalable Similarity Estimation in Social Networks: Closeness, Node Labels, and Random Edge Lengths. In COSN. ACM.
[13]
E. Cohen, D. Delling, T. Pajor, and R. F. Werneck. 2015. Distance-Based Influence in Networks: Computation and Maximization. Technical Report cs.SI/1410.6976. arXiv. http://arxiv.org/abs/1410.06976
[14]
E. Cohen and H. Kaplan. 2007. Spatially-decaying aggregation over a network: Model and algorithms. J. Comput. System Sci. Vol. 73 (2007), 265--288. Full version of a SIGMOD 2004 paper.
[15]
A. Condon and R. M. Karp. 2001. Algorithms for Graph Partitioning on the Planted Partition Model. Random Struct. Algorithms Vol. 18, 2 (2001).
[16]
M. Defferrard, X. Bresson, and P. Vandergheynst. 2016. Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering NIPS.
[17]
N. Du, L. Song, M. Gomez-Rodriguez, and H. Zha. 2013. Scalable Influence Estimation in Continuous-Time Diffusion Networks. In NIPS.
[18]
L. C. Freeman. 1979. Centrality in social networks: Conceptual clarification. Social Networks Vol. 1 (1979).
[19]
M. Gomez-Rodriguez, J. Leskovec, and A. Krause. 2010. Inferring Networks of Diffusion and Influence. In KDD.
[20]
A. Grover and J. Leskovec. 2016. node2vec: Scalable Feature Learning for Networks. In KDD. ACM.
[21]
M. Henaff, J. Bruna, and Y. LeCun. 2015. Deep Convolutional Networks on Graph-Structured Data. CoRR Vol. abs/1506.05163 (2015). http://arxiv.org/abs/1506.05163
[22]
T. Joachims. 1999. Transductive Inference for Text Classification Using Support Vector Machines ICML.
[23]
D. Kempe, J. M. Kleinberg, and É. Tardos. 2003. Maximizing the spread of influence through a social network KDD. ACM.
[24]
T. N. Kipf and M. Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks ICLR.
[25]
F. Lin and W. W. Cohen. 2008. The MultiRank Bootstrap Algorithm: Self-Supervised Political Blog Classification and Ranking Using Semi-Supervised Link Classification ICWSM.
[26]
M. H. Malewicz, G.and Austern, A.J.C Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. 2010. Pregel: a system for large-scale graph processing. In SIGMOD. ACM.
[27]
T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. 2013. Distributed Representations of Words and Phrases and their Compositionality NIPS.
[28]
S. Nandanwar and N. N. Murty. 2016. Structural Neighborhood Based Classification of Nodes in a Network KDD. ACM.
[29]
T. Opsahl, F. Agneessens, and J. Skvoretz. 2010. Node centrality in weighted networks: Generalizing degree and shortest paths. Social Networks Vol. 32 (2010). http://toreopsahl.com/2010/03/20/
[30]
B. Perozzi, R. Al-Rfou, and S. Skiena. 2014. DeepWalk: Online Learning of Social Representations KDD. ACM.
[31]
S. Ravi and Q. Diao. 2016. Large-Scale Semi-Supervised Learning Using Streaming Approximation AISTATS.
[32]
G. Sabidussi. 1966. The centrality index of a graph. Psychometrika Vol. 31, 4 (1966), 581--603.
[33]
H. J. Scudder. 1965. Probability of error of some adaptive pattern-recognition machines. IEEE Transactions on Information Theory Vol. 11 (1965).
[34]
A. Subramanya and P. P. Talukdar. 2014. Graph-based semi-supervised learning. Morgan & Claypool.
[35]
M. Whitney and A. Sarkar. 2012. Bootstrapping via Graph Propagation. In ACL.
[36]
Z. Yang, W. W. Cohen, and R. Salakhutdinov. 2016. Revisiting Semi-Supervised Learning with Graph Embeddings ICML. JMLR.org.
[37]
D. Yarowsky. 1995. Unsupervised Word Sense Disambiguation Rivaling Supervised Methods ACL.
[38]
D. Zhou, O. Bousquet, T. Lal, J. Weston, and B. Schölkopf. 2004. Learning with Local and Global Consistency. In NIPS.
[39]
X. Zhu and Z. Ghahramani. 2002. Learning from labeled and unlabeled data with label propagation. (2002). http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.3864
[40]
X. Zhu, Z. Ghahramani, and J. Laffery. 2003. Semi-supervised learning using Gaussian fields and harmonic functions ICML.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGMETRICS Performance Evaluation Review
ACM SIGMETRICS Performance Evaluation Review  Volume 46, Issue 1
SIGMETRICS '18
June 2018
142 pages
ISSN:0163-5999
DOI:10.1145/3292040
Issue’s Table of Contents
  • cover image ACM Conferences
    SIGMETRICS '18: Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems
    June 2018
    155 pages
    ISBN:9781450358460
    DOI:10.1145/3219617
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 June 2018
Published in SIGMETRICS Volume 46, Issue 1

Check for updates

Author Tags

  1. bootstrapping
  2. graph-based semi-supervised learning
  3. label propagation

Qualifiers

  • Research-article

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)8
  • Downloads (Last 6 weeks)0
Reflects downloads up to 27 Nov 2024

Other Metrics

Citations

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media