Nothing Special   »   [go: up one dir, main page]

Skip to main content

Reconstructing the Unseen: GRIOT for Attributed Graph Imputation with Optimal Transport

  • Conference paper
  • First Online:
Machine Learning and Knowledge Discovery in Databases. Research Track (ECML PKDD 2024)

Abstract

In recent years, there has been a significant surge in machine learning techniques, particularly in the domain of deep learning, tailored for handling attributed graphs. Nevertheless, to work, these methods assume that the attributes values are fully known, which is not realistic in numerous real-world applications. This paper explores the potential of Optimal Transport (OT) to impute missing attributes on graphs. To proceed, we design a novel multi-view OT loss function that can encompass both node feature data and the underlying topological structure of the graph by utilizing multiple graph representations. We then utilize this novel loss to train efficiently a Graph Convolutional Neural Network (GCN) architecture capable of imputing all missing values over the graph at once. We evaluate the interest of our approach with experiments both on synthetic data and real-world graphs, including different missingness mechanisms and a wide range of missing data. These experiments demonstrate that our method is competitive with the state-of-the-art in all cases and of particular interest on weakly homophilic graphs.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 139.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 79.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    Code and additional material available at: github.com/RichardSrn/GRIOT.

References

  1. Berg, R.V.D., Kipf, T.N., Welling, M.: Graph convolutional matrix completion. arXiv preprint arXiv:1706.02263 (2017)

  2. Brogat-Motte, L., Flamary, R., Brouard, C., Rousu, J., d’Alché-Buc, F.: Learning to predict graphs with Fused Gromov-Wasserstein barycenters. In: ICML, pp. 2321–2335 (2022)

    Google Scholar 

  3. Chen, X., Chen, S., Yao, J., Zheng, H., Zhang, Y., Tsang, I.W.: Learning on attribute-missing graphs. IEEE PAMI 44(2), 740–757 (2022)

    Article  Google Scholar 

  4. Chung, F.R.: Spectral Graph Theory, vol. 92. American Mathematical Society (1997)

    Google Scholar 

  5. Craven, M., et al.: Learning to extract symbolic knowledge from the world wide web. AAAI/IAAI 3(3.6), 2 (1998)

    Google Scholar 

  6. Cuturi, M.: Sinkhorn distances: lightspeed computation of optimal transport. In: Advances in Neural Information Processing Systems, vol. 26 (2013)

    Google Scholar 

  7. Defferrard, M., Bresson, X., Vandergheynst, P.: Convolutional neural networks on graphs with fast localized spectral filtering. In: Advances in Neural Information Processing Systems, vol. 29 (2016)

    Google Scholar 

  8. Dwivedi, V.P., Bresson, X.: A generalization of transformer networks to graphs. CoRR abs/2012.09699 (2020). https://arxiv.org/abs/2012.09699

  9. Flamary, R., et al.: POT: python optimal transport. J. Mach. Learn. Res. 22(78), 1–8 (2021)

    Google Scholar 

  10. Huisman, M.: Imputation of missing network data: some simple procedures. In: Encyclopedia of Social Network Analysis and Mining, pp. 707–715 (2014)

    Google Scholar 

  11. Jiang, B., Zhang, Z.: Incomplete graph representation and learning via partial graph neural networks. arXiv preprint arXiv:2003.10130 (2020)

  12. Kantorovich, L.V.: Mathematical methods of organizing and planning production. Manage. Sci. 6(4), 366–422 (1960)

    Article  MathSciNet  Google Scholar 

  13. Kerdoncuff, T., Emonet, R., Perrot, M., Sebban, M.: Optimal tensor transport. In: AAAI Conference on Artificial Intelligence, vol. 36, pp. 7124–7132 (2022)

    Google Scholar 

  14. Khanam, K.Z., Srivastava, G., Mago, V.: The homophily principle in social network analysis: a survey. Multimedia Tools Appl. 82(6), 8811–8854 (2023)

    Article  Google Scholar 

  15. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

  16. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations (2017)

    Google Scholar 

  17. Kossinets, G.: Effects of missing data in social networks. Soc. Netw. 247–268 (2006)

    Google Scholar 

  18. Laclau, C., Redko, I., Choudhary, M., Largeron, C.: All of the fairness for edge prediction with optimal transport. In: AISTATS, vol. 130, pp. 1774–1782 (2021)

    Google Scholar 

  19. Little, R.J., Rubin, D.B.: Statistical Analysis with Missing Data, vol. 793. Wiley, Hoboken (2019)

    Google Scholar 

  20. McPherson, M., Smith-Lovin, L., Cook, J.M.: Birds of a feather: homophily in social networks. Ann. Rev. Sociol. 27(1), 415–444 (2001)

    Article  Google Scholar 

  21. Mensch, A., Blondel, M., Peyré, G.: Geometric losses for distributional learning. In: International Conference on Machine Learning, pp. 4516–4525 (2019)

    Google Scholar 

  22. Monti, F., Bronstein, M., Bresson, X.: Geometric matrix completion with recurrent multi-graph neural networks. In: NIPS, vol. 30 (2017)

    Google Scholar 

  23. Muzellec, B., Josse, J., Boyer, C., Cuturi, M.: Missing data imputation using optimal transport. In: ICML, pp. 7130–7140 (2020)

    Google Scholar 

  24. Rossi, E., Kenlay, H., Gorinova, M.I., Chamberlain, B.P., Dong, X., Bronstein, M.M.: On the unreasonable effectiveness of feature propagation in learning on graphs with missing node features. In: Learning on Graphs Conference (2022)

    Google Scholar 

  25. Schafer, J.L., Graham, J.W.: Missing data: our view of the state of the art. Psychol. Methods 7(2), 147 (2002)

    Article  Google Scholar 

  26. Stekhoven, D.J., Bühlmann, P.: MissForest-non-parametric missing value imputation for mixed-type data. Bioinformatics 28(1), 112–118 (2011)

    Article  Google Scholar 

  27. Taguchi, H., Liu, X., Murata, T.: Graph convolutional networks for graphs containing missing features. Futur. Gener. Comput. Syst. 117, 155–168 (2021)

    Article  Google Scholar 

  28. Van Buuren, S., Groothuis-Oudshoorn, K.: Mice: multivariate imputation by chained equations in R. J. Stat. Softw. 45, 1–67 (2011)

    Article  Google Scholar 

  29. Vayer, T., Chapel, L., Flamary, R., Tavenard, R., Courty, N.: Fused Gromov-Wasserstein distance for structured objects: theoretical foundations and mathematical properties. Algorithms 13(9), 212 (2020)

    Article  Google Scholar 

  30. Villani, C., et al.: Optimal Transport: Old and New, vol. 338. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-540-71050-9

    Book  Google Scholar 

  31. Yang, Z., Cohen, W., Salakhudinov, R.: Revisiting semi-supervised learning with graph embeddings. In: ICML, pp. 40–48 (2016)

    Google Scholar 

  32. Yoon, J., Jordon, J., van der Schaar, M.: GAIN: missing data imputation using generative adversarial nets. In: Proceedings of ICML, vol. 80, pp. 5689–5698 (2018)

    Google Scholar 

  33. Zhang, J., Xiao, X., Huang, L.K., Rong, Y., Bian, Y.: Fine-tuning graph neural networks via graph topology induced optimal transport. In: IJCAI, pp. 3730–3736 (2022)

    Google Scholar 

  34. Zheng, X., Liu, Y., Pan, S., Zhang, M., Jin, D., Yu, P.S.: Graph neural networks for graphs with heterophily: a survey. arXiv preprint arXiv:2202.07082 (2022)

Download references

Acknowledgment

This work has been funded by a public grant from the French National Research Agency (ANR) under the “France 2030” investment plan, which has the reference EUR MANUTECH SLEIGHT - ANR-17-EURE-0026.

This work is openly licensed via CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Richard Serrano .

Editor information

Editors and Affiliations

Ethics declarations

Ethical Statement

Datasets used in this article are publicly accessible, and the code is public, making the results reproducible (Code and additional material available at: github.com/RichardSrn/GRIOT). This paper presents work in machine learning, a field that can have potential societal consequences, but none of which we feel must be specifically highlighted here.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 510 KB)

Rights and permissions

Reprints and permissions

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Serrano, R., Laclau, C., Jeudy, B., Largeron, C. (2024). Reconstructing the Unseen: GRIOT for Attributed Graph Imputation with Optimal Transport. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science(), vol 14946. Springer, Cham. https://doi.org/10.1007/978-3-031-70365-2_16

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-70365-2_16

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70364-5

  • Online ISBN: 978-3-031-70365-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics