
DOI: 10.5555/3692070.3693518

Rethinking independent cross-entropy loss for graph-structured data

Published: 21 July 2024

Abstract

Graph neural networks (GNNs) have exhibited prominent performance in learning graph-structured data. For the node classification task, traditional supervised learning rests on an i.i.d. assumption among node labels: it simply sums the cross-entropy losses of individual training nodes and optimizes the GNN's weights with the average loss. Unlike other data formats, however, nodes are naturally connected. We find that modeling node labels as independently distributed restricts a GNN's ability to generalize over the entire graph and to defend against adversarial attacks. In this work, we propose a new framework, termed joint-cluster supervised learning, that models the joint distribution of each node with its corresponding cluster. We learn the joint distribution of node and cluster labels conditioned on their representations, and train GNNs with the resulting joint loss. In this way, the data-label reference signals extracted from the local cluster explicitly strengthen the discrimination ability on the target node. Extensive experiments demonstrate that joint-cluster supervised learning effectively bolsters GNNs' node classification accuracy. Furthermore, benefiting from reference signals that may be free of malicious interference, our learning paradigm significantly protects node classification from adversarial attacks.
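To make the contrast with the standard objective concrete, the following is a minimal PyTorch sketch of what a joint-cluster style loss could look like. It is an illustrative reconstruction from the abstract, not the authors' exact formulation: the cluster assignment, the mean-pooled cluster "reference" representation, the per-cluster label (taken as given here), and the joint classifier head are all assumptions made for this sketch.

```python
# Hypothetical sketch: independent cross-entropy vs. a joint-cluster loss.
# Everything cluster-related below is an assumption made for illustration.
import torch
import torch.nn.functional as F


def independent_ce(logits, y):
    """Baseline: average per-node cross-entropy, treating labels as i.i.d."""
    return F.cross_entropy(logits, y)


def joint_cluster_ce(node_emb, y, cluster_id, cluster_y, joint_head, num_classes):
    """Joint loss over (node label, cluster label) pairs.

    node_emb:   [N, d] node representations from a GNN
    y:          [N]    node labels in {0..C-1}
    cluster_id: [N]    precomputed cluster assignment (e.g. via METIS)
    cluster_y:  [K]    one label per cluster (e.g. the majority label of its
                       training nodes) -- an assumption of this sketch
    joint_head: maps [node_emb ; cluster_emb] to C*C joint logits
    """
    K = int(cluster_id.max()) + 1
    # Mean-pool each cluster's nodes into a "reference" representation.
    cluster_emb = torch.zeros(K, node_emb.size(1), device=node_emb.device)
    cluster_emb.index_add_(0, cluster_id, node_emb)
    counts = torch.bincount(cluster_id, minlength=K).clamp(min=1)
    cluster_emb = cluster_emb / counts.unsqueeze(1).float()
    # Score all (node label, cluster label) pairs jointly, conditioning the
    # target node's prediction on its cluster's reference representation.
    joint_logits = joint_head(torch.cat([node_emb, cluster_emb[cluster_id]], dim=1))
    # Flatten the pair (y_i, y_cluster(i)) into a single joint class index.
    joint_target = y * num_classes + cluster_y[cluster_id]
    return F.cross_entropy(joint_logits, joint_target)


# Toy usage with random embeddings standing in for GNN outputs.
N, d, C, K = 8, 16, 3, 2
node_emb = torch.randn(N, d)
y = torch.randint(0, C, (N,))
cluster_id = torch.randint(0, K, (N,))
cluster_y = torch.randint(0, C, (K,))
joint_head = torch.nn.Linear(2 * d, C * C)
print(joint_cluster_ce(node_emb, y, cluster_id, cluster_y, joint_head, C))
```

The design point the sketch illustrates is the one the abstract emphasizes: each node's loss is no longer computed in isolation but is conditioned on a cluster-level reference signal, so the supervision couples the node's label with its local cluster rather than treating labels as independent.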



Published In

ICML'24: Proceedings of the 41st International Conference on Machine Learning
July 2024, 63010 pages

Publisher

JMLR.org


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%
