
DOI: 10.5555/3692070.3693518

Rethinking independent cross-entropy loss for graph-structured data

Published: 21 July 2024

Abstract

Graph neural networks (GNNs) have exhibited prominent performance in learning graph-structured data. For the node classification task, traditional supervised learning rests on an i.i.d. assumption among node labels: it simply sums the cross-entropy losses of individual training nodes and optimizes the GNN's weights with the average loss. Unlike other data formats, however, nodes are naturally connected. We find that modeling node labels as independently distributed restricts a GNN's ability to generalize over the entire graph and to defend against adversarial attacks. In this work, we propose a new framework, termed joint-cluster supervised learning, that models the joint distribution of each node with its corresponding cluster. We learn the joint distribution of node and cluster labels conditioned on their representations, and train GNNs with the resulting joint loss. In this way, the data-label reference signals extracted from the local cluster explicitly strengthen the discrimination ability on the target node. Extensive experiments demonstrate that joint-cluster supervised learning effectively bolsters GNNs' node classification accuracy. Furthermore, benefiting from reference signals that may be free of malicious interference, our learning paradigm significantly protects node classification from adversarial attacks.
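To make the contrast with the standard objective concrete, the following is a minimal PyTorch sketch of what a joint-cluster style loss could look like. It is an illustrative reconstruction from the abstract, not the authors' exact formulation: the cluster assignment, the mean-pooled cluster "reference" representation, the per-cluster label (taken as given here), and the joint classifier head are all assumptions made for this sketch.

```python
# Hypothetical sketch: independent cross-entropy vs. a joint-cluster loss.
# Everything cluster-related below is an assumption made for illustration.
import torch
import torch.nn.functional as F


def independent_ce(logits, y):
    """Baseline: average per-node cross-entropy, treating labels as i.i.d."""
    return F.cross_entropy(logits, y)


def joint_cluster_ce(node_emb, y, cluster_id, cluster_y, joint_head, num_classes):
    """Joint loss over (node label, cluster label) pairs.

    node_emb:   [N, d] node representations from a GNN
    y:          [N]    node labels in {0..C-1}
    cluster_id: [N]    precomputed cluster assignment (e.g. via METIS)
    cluster_y:  [K]    one label per cluster (e.g. the majority label of its
                       training nodes) -- an assumption of this sketch
    joint_head: maps [node_emb ; cluster_emb] to C*C joint logits
    """
    K = int(cluster_id.max()) + 1
    # Mean-pool each cluster's nodes into a "reference" representation.
    cluster_emb = torch.zeros(K, node_emb.size(1), device=node_emb.device)
    cluster_emb.index_add_(0, cluster_id, node_emb)
    counts = torch.bincount(cluster_id, minlength=K).clamp(min=1)
    cluster_emb = cluster_emb / counts.unsqueeze(1).float()
    # Score all (node label, cluster label) pairs jointly, conditioning the
    # target node's prediction on its cluster's reference representation.
    joint_logits = joint_head(torch.cat([node_emb, cluster_emb[cluster_id]], dim=1))
    # Flatten the pair (y_i, y_cluster(i)) into a single joint class index.
    joint_target = y * num_classes + cluster_y[cluster_id]
    return F.cross_entropy(joint_logits, joint_target)


# Toy usage with random embeddings standing in for GNN outputs.
N, d, C, K = 8, 16, 3, 2
node_emb = torch.randn(N, d)
y = torch.randint(0, C, (N,))
cluster_id = torch.randint(0, K, (N,))
cluster_y = torch.randint(0, C, (K,))
joint_head = torch.nn.Linear(2 * d, C * C)
print(joint_cluster_ce(node_emb, y, cluster_id, cluster_y, joint_head, C))
```

The design point the sketch illustrates is the one the abstract emphasizes: each node's loss is no longer computed in isolation but is conditioned on a cluster-level reference signal, so the supervision couples the node's label with its local cluster rather than treating labels as independent.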



Published In

ICML'24: Proceedings of the 41st International Conference on Machine Learning
July 2024, 63010 pages

Publisher

JMLR.org


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%
