Graph Adaptive Attention Network with Cross-Entropy
Abstract
1. Introduction
- (1) To generate a different weight for each neighbor node of the central node, we design a novel adaptive attention mechanism (AAM).
- (2) Based on the AAM, we use Multi-head Graph Convolution (MHGC) to model and represent features better.
- (3) We adopt a cross-entropy loss function to measure the discrepancy between the predicted values and the ground truth, which improves the classification accuracy.
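The first two contributions can be illustrated with a minimal sketch. The exact AAM scoring function is not reproduced in this outline, so the code below assumes a GAT-style score (a learned vector applied to the concatenated central and neighbor features, followed by LeakyReLU and a softmax over the neighbors). The names `attention_weights`, `multi_head_aggregate`, and the scoring vectors in `heads` are hypothetical, and two heads are used instead of the n = 8 reported in the ablation table purely to keep the example short.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(center, neighbors, a):
    """Return one weight per neighbor of the central node.

    Assumption: the score is a . [center || neighbor] passed through
    LeakyReLU (slope 0.2), as in GAT; the paper's AAM may differ.
    `a` is a hypothetical scoring vector of length 2 * len(center).
    """
    scores = []
    for nb in neighbors:
        pair = center + nb  # concatenate the two feature lists
        s = sum(ai * xi for ai, xi in zip(a, pair))
        scores.append(s if s > 0 else 0.2 * s)
    return softmax(scores)


def multi_head_aggregate(center, neighbors, heads):
    """Concatenate the attention-weighted neighbor averages of all heads."""
    dim = len(center)
    out = []
    for a in heads:
        w = attention_weights(center, neighbors, a)
        for d in range(dim):
            out.append(sum(w[j] * neighbors[j][d] for j in range(len(neighbors))))
    return out


# Toy example: one central node with three neighbors and two heads.
center = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
heads = [[0.1, 0.2, 0.3, -0.4], [0.0, 0.0, -1.0, 1.0]]

weights = attention_weights(center, neighbors, heads[0])
combined = multi_head_aggregate(center, neighbors, heads)
```

Each head produces its own set of neighbor weights, so the concatenated output has `len(heads) * len(center)` entries. Concatenation is one common way to merge heads; averaging them is an equally plausible choice that this sketch does not cover.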
2. Related Work
2.1. Spectral-Based Methods
2.2. Spatial-Based Methods
3. Methodology
3.1. Overall
3.2. Graph Layers with AAM
3.3. Cross Entropy Loss
- (1) Encoding features: encode the node features of the input graph to obtain the initial feature representation of the nodes.
- (2) Attention mechanism: in the hidden layer, use the attention mechanism to compute a weighted average over the neighboring nodes and obtain the updated node feature representation.
- (3) Fully connected layer: in the output layer, the updated node feature representation is passed through a fully connected transformation, and the softmax function is applied to obtain the predicted probability.
- (4) Calculate the loss: use the cross-entropy loss function to calculate the difference between the true category distribution and the predicted probability distribution, and average it over the nodes to obtain the overall loss.
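The four steps above can be sketched end to end in plain Python. This is an illustrative sketch under stated assumptions, not the paper's implementation: the encoder is a single linear map, the attention score is a plain dot product between encoded features (a stand-in for the AAM), and every neighbor list is assumed to contain the node itself. The names `forward`, `cross_entropy`, `W_enc`, and `W_out` are hypothetical.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def forward(features, neighbors, W_enc, W_out):
    # (1) Encode the raw node features (assumed: one linear map).
    h = [matvec(W_enc, x) for x in features]
    # (2) Attention-weighted average over each node's neighbors.
    #     Assumed score: dot product of encoded features.
    updated = []
    for i, nbrs in enumerate(neighbors):
        scores = [sum(a * b for a, b in zip(h[i], h[j])) for j in nbrs]
        alpha = softmax(scores)
        updated.append([
            sum(alpha[k] * h[j][d] for k, j in enumerate(nbrs))
            for d in range(len(h[i]))
        ])
    # (3) Fully connected transformation followed by softmax.
    return [softmax(matvec(W_out, u)) for u in updated]


def cross_entropy(probs, labels):
    # (4) Mean negative log-probability of the true class.
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


# Toy graph: 3 nodes, 2 features, 2 classes; neighbor lists include self.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[0, 1], [1, 2], [0, 1, 2]]
W_enc = [[1.0, 0.0], [0.0, 1.0]]
W_out = [[2.0, -1.0], [-1.0, 2.0]]
labels = [0, 1, 1]

probs = forward(features, neighbors, W_enc, W_out)
loss = cross_entropy(probs, labels)
```

With one-hot true labels, the cross-entropy of a node reduces to the negative log-probability of its true class, which is why `cross_entropy` indexes `p[y]` directly. In a semi-supervised setting the average would typically be taken over the labeled nodes only.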
4. Experiments
4.1. Datasets
4.2. Ablation Experiments
4.3. Comparison with Other Methods
5. Conclusions and Future Work
Funding
Data Availability Statement
Conflicts of Interest
Dataset | Nodes | Edges | Features per Node | Classes |
---|---|---|---|---|
Cora | 2708 | 5429 | 1433 | 7 |
Citeseer | 3312 | 4723 | 3703 | 7 |
Pubmed | 19,717 | 44,338 | 500 | 3 |
Group | AAM | MHGC (n = 8) | n_hidden = 64 | n_hidden = 96 | n_hidden = 128 | Accuracy (%) |
---|---|---|---|---|---|---|
Group1 | × | × | √ | × | × | 83.9 |
Group1 | × | × | × | √ | × | 83.5 |
Group1 | × | × | × | × | √ | 83.0 |
Group2 | × | √ | √ | × | × | 84.6 |
Group2 | × | √ | × | √ | × | 85.1 |
Group2 | × | √ | × | × | √ | 84.6 |
Group3 | √ | × | √ | × | × | 84.9 |
Group3 | √ | × | × | √ | × | 84.3 |
Group3 | √ | × | × | × | √ | 84.6 |
Group4 | √ | √ | √ | × | × | 84.6 |
Group4 | √ | √ | × | √ | × | 85.4 |
Group4 | √ | √ | × | × | √ | 85.6 |
© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chen, Z. Graph Adaptive Attention Network with Cross-Entropy. Entropy 2024, 26, 576. https://doi.org/10.3390/e26070576