DOI: 10.1145/3580305.3599398
Research Article · Public Access

Kernel Ridge Regression-Based Graph Dataset Distillation

Published: 04 August 2023

Abstract

The huge volume of emerging graph datasets has become a double-edged sword for graph machine learning. On the one hand, it empowers the success of a myriad of graph neural networks (GNNs) with strong empirical performance. On the other hand, training modern GNNs on huge graph data is computationally expensive. How to distill a given graph dataset while retaining most of the trained models' performance is a challenging problem. Existing efforts approach this problem by solving meta-learning-based bilevel optimization objectives. A major hurdle is that the exact solutions of these objectives are computationally intensive, so most, if not all, existing methods resort to approximate strategies, which in turn hurt distillation performance. In this paper, inspired by recent advances in neural network kernel methods, we adopt a kernel ridge regression-based meta-learning objective that admits a feasible exact solution. However, computing the graph neural tangent kernel is very expensive, especially in the context of dataset distillation. In response, we design a graph kernel, named LiteGNTK, tailored to the dataset distillation problem and closely related to the classic random walk graph kernel. Building on it, we propose an effective model named Kernel rIdge regression-based graph Dataset Distillation (KIDD), together with several variants. KIDD is efficient in both the forward and backward propagation processes, and it shows strong empirical performance on 7 real-world datasets compared with state-of-the-art distillation methods. Thanks to its ability to find the exact solution of the distillation objective, the training graphs learned by KIDD can sometimes even outperform the original whole training set while using as few as 1.65% of the training graphs.
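
To make the abstract's central idea concrete, the sketch below illustrates a kernel ridge regression-based distillation loss in the spirit of the KRR meta-learning objective of Nguyen et al. (ICLR 2021), which the paper builds on. It is a minimal illustration, not the authors' KIDD implementation: the kernel matrices are assumed to be precomputed by some graph kernel (in the paper, LiteGNTK would play that role), and the function and parameter names are our own.

```python
# Minimal sketch (NumPy) of a KRR-based distillation objective.
# Assumption: kernel matrices are precomputed by some graph kernel.
import numpy as np

def krr_distillation_loss(K_ss, K_ts, y_s, y_t, ridge=1e-3):
    """Fit KRR on the small synthetic set in closed form (the inner
    problem), then measure squared error on the real training set
    (the outer objective).

    K_ss : (m, m) kernel matrix among the m synthetic graphs
    K_ts : (n, m) kernel matrix between n real and m synthetic graphs
    y_s  : (m, c) one-hot labels of the synthetic graphs
    y_t  : (n, c) one-hot labels of the real graphs
    """
    m = K_ss.shape[0]
    # Exact inner solution: alpha = (K_ss + ridge * I)^{-1} y_s.
    alpha = np.linalg.solve(K_ss + ridge * np.eye(m), y_s)
    # Outer loss: how well the distilled model predicts the real labels.
    preds = K_ts @ alpha
    return 0.5 * np.mean((y_t - preds) ** 2)
```

In the distillation setting, the synthetic graphs behind K_ss and K_ts are the optimization variables, and this loss is differentiated through the linear solve to update them; because the inner problem is solved exactly, no approximate bilevel unrolling is needed. For context on the kernel side, the abstract relates LiteGNTK to the classic random walk graph kernel, which counts walks common to two graphs via their direct (Kronecker) product; a textbook truncated geometric version of that classic kernel (again, not LiteGNTK itself) looks like this:

```python
def random_walk_kernel(A1, A2, decay=0.1, steps=10):
    """Truncated geometric random walk kernel between two graphs given
    their adjacency matrices: sum_k decay^k * (# length-k common walks),
    counted on the direct (Kronecker) product graph."""
    Ax = np.kron(A1, A2)          # adjacency of the direct product graph
    walk = np.ones(Ax.shape[0])   # all-ones start vector over product nodes
    total = 0.0
    for k in range(1, steps + 1):
        walk = Ax @ walk          # propagate walks one more step
        total += decay ** k * walk.sum()  # 1^T Ax^k 1, geometrically decayed
    return total
```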

Supplementary Material

MP4 File (rtfp0477-2min-promo.mp4)
The promotional video of the KDD 2023 paper "Kernel Ridge Regression-Based Graph Dataset Distillation".



Information

Published In

KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2023
5996 pages
ISBN:9798400701030
DOI:10.1145/3580305
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 August 2023

Permissions

Request permissions for this article.


Author Tags

  1. graph dataset distillation
  2. graph machine learning

Qualifiers

  • Research-article

Funding Sources

  • NSF
  • DARPA
  • ARO
  • DHS
  • NIFA

Conference

KDD '23

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Bibliometrics & Citations

Article Metrics

  • Downloads (last 12 months): 889
  • Downloads (last 6 weeks): 81
Reflects downloads up to 22 Sep 2024.


Cited By

  • (2024) Graph Condensation for Open-World Graph Learning. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 851-862. DOI: 10.1145/3637528.3671917. Online publication date: 25-Aug-2024.
  • (2024) Graph Data Condensation via Self-expressive Graph Structure Reconstruction. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 1992-2002. DOI: 10.1145/3637528.3671710. Online publication date: 25-Aug-2024.
  • (2024) Self-Supervised Learning for Graph Dataset Condensation. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 3289-3298. DOI: 10.1145/3637528.3671682. Online publication date: 25-Aug-2024.
  • (2024) Masked Graph Transformer for Large-Scale Recommendation. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2502-2506. DOI: 10.1145/3626772.3657971. Online publication date: 10-Jul-2024.
  • (2024) Towards Mitigating Dimensional Collapse of Representations in Collaborative Filtering. Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 106-115. DOI: 10.1145/3616855.3635832. Online publication date: 4-Mar-2024.
  • (2024) Fast Graph Condensation with Structure-based Neural Tangent Kernel. Proceedings of the ACM Web Conference 2024, 4439-4448. DOI: 10.1145/3589334.3645694. Online publication date: 13-May-2024.
  • (2024) Globally Interpretable Graph Learning via Distribution Matching. Proceedings of the ACM Web Conference 2024, 992-1002. DOI: 10.1145/3589334.3645674. Online publication date: 13-May-2024.
  • (2024) Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 6296-6304. DOI: 10.1109/CVPR52733.2024.00602. Online publication date: 16-Jun-2024.
  • (2023) Hessian-aware Quantized Node Embeddings for Recommendation. Proceedings of the 17th ACM Conference on Recommender Systems, 757-762. DOI: 10.1145/3604915.3608826. Online publication date: 14-Sep-2023.
