Hypergraph co-optimal transport: metric and categorical properties

Samir Chowdhury¹,
Tom Needham²,
Ethan Semrad²,
Bei Wang³ &
…
Youjia Zhou³

359 Accesses
3 Citations
Explore all metrics

Abstract

Hypergraphs capture multi-way relationships in data, and they have consequently seen a number of applications in higher-order network analysis, computer vision, geometry processing, and machine learning. In this paper, we develop theoretical foundations for studying the space of hypergraphs using ingredients from optimal transport. By enriching a hypergraph with probability measures on its nodes and hyperedges, as well as relational information capturing local and global structures, we obtain a general and robust framework for studying the collection of all hypergraphs. First, we introduce a hypergraph distance based on the co-optimal transport framework of Redko et al. and study its theoretical properties. Second, we formalize common methods for transforming a hypergraph into a graph as maps between the space of hypergraphs and the space of graphs, and study their functorial properties and Lipschitz bounds. Finally, we demonstrate the versatility of our Hypergraph Co-Optimal Transport (HyperCOT) framework through various examples.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 2

Ollivier Ricci curvature of directed hypergraphs

Article Open access 27 July 2020

Hypergraph clustering by iteratively reweighted modularity maximization

Article Open access 20 August 2020

Cut Vertex Transit Functions of Hypergraphs

Notes

Hypergraph Co-Optimal Transport: https://github.com/samirchowdhury/HyperCOT.
https://github.com/PythonOT/COOT.
https://pythonot.github.io/.
Hypergraph Co-Optimal Transport: https://github.com/samirchowdhury/HyperCOT.

References

Agarwal, S., Lim, J., Zelnik-Manor, L., Perona, P., Kriegman, D., Belongie, S.: Beyond pairwise clustering. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. 838–845 (2005)
Ambrosio, L., Gigli, N., Savaré, G.: Gradient Flows: In Metric Spaces and in the Space of Probability Measures. Birkhäuser Verlag, Basel (2005)
Google Scholar
Banerjee, A., Char, A., Mondal, B.: Spectra of general hypergraphs. Linear Algebra Appl. 1, 14–30 (2017)
MathSciNet Google Scholar
Bermond, J.-C., Heydemann, M.-C., Sotteau, D.: Line graphs of hypergraphs I. Discrete Math. 18(3), 235–241 (1977)
MathSciNet Google Scholar
Blumberg, A.J., Carriere, M., Mandell, M.A., Rabadan, R., Villar, S.: MREC: a fast and versatile framework for aligning and matching point clouds with applications to single cell molecular data. arXiv preprint arXiv:2001.01666 (2020)
Bochner, S.: Harmonic Analysis and the Theory of Probability. University of California Press, Berkeley (2020)
Google Scholar
Borceux, F.: Handbook of Categorical Algebra: Basic Category Theory, vol. 1. Cambridge University Press, Cambridge (1994)
Google Scholar
Bretto, A.: Hypergraph Theory: An Introduction. Springer, New York (2013)
Google Scholar
Bronstein, A.M., Bronstein, M.M., Kimmel, R.: Numerical Geometry of Non-rigid Shapes. Springer, New York (2008)
Google Scholar
Carlsson, G., Mémoli, F.: Persistent clustering and a theorem of J. Kleinberg. arXiv preprint arXiv:0808.2241 (2008)
Carlsson, G., Mémoli, F.: Characterization, stability and convergence of hierarchical clustering methods. J. Mach. Learn. Res. 11, 1425–1470 (2010)
MathSciNet Google Scholar
Carlsson, G., Mémoli, F.: Classifying clustering schemes. Found. Comput. Math. 13(2), 221–252 (2013)
MathSciNet Google Scholar
Carlsson, G., Mémoli, F., Ribeiro, A., Segarra, S.: Axiomatic construction of hierarchical clustering in asymmetric networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5219–5223 (2013)
Carlsson, G., Mémoli, F., Ribeiro, A., Segarra, S.: Hierarchical clustering of asymmetric networks. Adv. Data Anal. Classif. 12(1), 65–105 (2018)
MathSciNet Google Scholar
Carlsson, G., Mémoli, F., Segarra, S.: Robust hierarchical clustering for directed networks: an axiomatic approach. SIAM J. Appl. Algebra Geom. 5(4), 675–700 (2021)
MathSciNet Google Scholar
Cencetti, G., Battiston, F., Lepri, B., Karsai, M.: Temporal properties of higher-order interactions in social networks. Sci. Rep. 11, 7028 (2021)
Google Scholar
Chertok, M., Keller, Y.: Efficient high order matching. IEEE Trans. Pattern Anal. Mach. Intell. 32(12), 2205–2215 (2010)
Google Scholar
Chowdhury, S., Miller, D., Needham, T.: Quantized Gromov–Wasserstein. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 811–827 (2021)
Chowdhury, S., Mémoli, F.: Explicit geodesics in Gromov–Hausdorff space. Electron. Res. Announc. Math. Sci. 25, 48 (2018)
MathSciNet Google Scholar
Chowdhury, S., Mémoli, F.: The Gromov–Wasserstein distance between networks and stable network invariants. Inf. Inference J. IMA 8(4), 757–787 (2019)
MathSciNet Google Scholar
Chowdhury, S., Needham, T.: Gromov–Wasserstein averaging in a Riemannian framework. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 842–843 (2020)
Chowdhury, S., Needham, T.: Generalized spectral clustering via Gromov–Wasserstein learning. In: International Conference on Artificial Intelligence and Statistics, pp. 712–720 (2021)
Culbertson, J., Guralnik, D.P., Hansen, J., Stiller, P.F.: Consistency constraints for overlapping data clustering. arXiv preprint arXiv:1608.04331 (2016)
Culbertson, J., Guralnik, D.P., Stiller, P.F.: Functorial hierarchical clustering with overlaps. Discrete Appl. Math. 236, 108–123 (2018)
MathSciNet Google Scholar
Dey, T.K., Mémoli, F., Wang, Y.: Multiscale mapper: topological summarization via codomain covers. In: Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 997–1013 (2016)
Dörfler, W., Waller, D.: A category-theoretical approach to hypergraphs. Arch. Math. 34(1), 185–192 (1980)
MathSciNet Google Scholar
Duchenne, O., Bach, F., Kweon, I.-S., Ponce, J.: A tensor-based algorithm for high-order graph matching. IEEE Trans. Pattern Anal. Mach. Intell. 33(12), 2383–2395 (2011)
Google Scholar
Flamary, R., Courty, N.: POT: Python Optimal Transport library (2017)
Flamary, R., Courty, N., Gramfort, A., Alaya, M.Z., Boisbunon, A., Chambon, S., Chapel, L., Corenflos, A., Fatras, K., Fournier, N., Gautheron, L., Gayraud, N.T.H., Janati, H., Rakotomamonjy, A., Redko, I., Rolet, A., Schutz, A., Seguy, V., Sutherland, D.J., Tavenard, R., Tong, A., Vayer, T.: Pot: Python optimal transport. J. Mach. Learn. Res. 22(78), 1–8 (2021)
Google Scholar
Folland, G.B.: Real Analysis: Modern Techniques and Their Applications. Wiley, Hoboken (1999)
Google Scholar
Gallo, G., Ülkücü, A.: Bilinear programming: an exact algorithm. Math. Program. 12(1), 173–194 (1977)
MathSciNet Google Scholar
Gold, S., Rangarajan, A.: A graduated assignment algorithm for graph matching. IEEE Trans. Pattern Anal. Mach. Intell. 18(4), 377–388 (1996)
Google Scholar
Hell, P., Nesetril, J.: Graphs and Homomorphisms, vol. 28. OUP, Oxford (2004)
Google Scholar
Hendrikson, R.: Using Gromov–Wasserstein distance to explore sets of networks. Master’s thesis, University of Tartu (2016)
Isbell, J.R.: Six theorems about injective metric spaces. Commentarii Mathematici Helvetici 39(1), 65–76 (1964)
MathSciNet Google Scholar
Karoński, M., Palka, Z.: On Marczewski–Steinhaus type distance between hypergraphs. Applicationes Mathematicae 16(1), 47–57 (1977)
MathSciNet Google Scholar
Klamt, S., Haus, U.-U., Theis, F.: Hypergraphs and cellular networks. PLoS Comput. Biol. 5, e1000385 (2009)
MathSciNet Google Scholar
Knuth, D.E.: The Stanford GraphBase: A Platform for Combinatorial Computing. ACM Press, New York (1994)
Google Scholar
Konstantinova, E.V., Skorobogatov, V.A.: Application of hypergraph theory in chemistry. Discrete Math. 235(1–3), 365–383 (2001)
MathSciNet Google Scholar
Kountouras, A., Kintis, P., Lever, C., Chen, Y., Nadji, Y., Dagon, D., Antonakakis, M., Joffe, R.: Enabling network security through active DNS datasets. In: International Symposium on Research in Attacks, Intrusions, and Defenses, pp. 188–208 (2016)
Lawvere, F.W.: The category of probabilistic mappings. Preprint (1962)
Lee, J., Cho, M., Lee, K.M.: Hyper-graph matching via reweighted random walks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2011)
Li, M., Palande, S., Yan, L., Wang, B.: Sketching merge trees for scientific data visualization. arXiv preprint arXiv:2101.03196 (2021)
Lin, C.-L.: Hardness of approximating graph transformation problem. In: Algorithms and Computation: 5th International Symposium, pp. 74–82 (1994)
Lotito, Q.F., Musciotto, F., Montresor, A., Battiston, F.: Higher-order motif analysis in hypergraphs. Commun. Phys. 5, 79 (2022)
Google Scholar
Mémoli, F.: On the use of Gromov-Hausdorff distances for shape comparison. In: Eurographics Symposium on Point-Based Graphics, pp. 81–90 (2007)
Mémoli, F.: Gromov–Wasserstein distances and the metric approach to object matching. Found. Comput. Math. 11(4), 417–487 (2011a)
Mémoli, F.: A spectral notion of Gromov–Wasserstein distance and related methods. Appl. Comput. Harmon. Anal. 30(3), 363–401 (2011b)
Mutlu, A., Gürdal, U.: Bipolar metric spaces and some fixed point theorems. J. Nonlinear Sci. Appl. 9(9), 5362–5373 (2016)
MathSciNet Google Scholar
Patania, A., Petri, G., Vaccarino, F.: The shape of collaborations. PJ Data Sci. 6, 18 (2017)
Google Scholar
Perrone, P.: Lifting couplings in wasserstein spaces. arXiv preprint arXiv: 2110.06591 (2021)
Peyré, G., Cuturi, M., Solomon, J.: Gromov–Wasserstein averaging of kernel and distance matrices. In: International Conference on Machine Learning, pp. 2664–2672 (2016)
Peyré, G., Cuturi, M., et al.: Computational optimal transport: with applications to data science. Found. Trends Mach. Learn. 11(5–6), 355–607 (2019)
Google Scholar
Pu, L., Faltings, B.: Hypergraph learning with hyperedge expansion. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 410–425 (2012)
Redko, I., Vayer, T., Flamary, R., Courty, N.: Co-optimal transport. In: Advances in Neural Information Processing Systems, vol. 33, pp. 17559–17570 (2020)
Reuter, M., Wolter, F.-E., Peinecke, N.: Laplace-Beltrami spectra as ‘Shape-DNA’ of surfaces and solids. Comput. Aided Des. 38(4), 342–366 (2006)
Google Scholar
Rodri’guez, J.A.: On the Laplacian eigenvalues and metric parameters of hypergraphs. Linear Multilinear Algebra 50(1), 1–14 (2002)
MathSciNet Google Scholar
Scetbon, M., Peyré, G., Cuturi, M.: Linear-time gromov wasserstein distances using low rank couplings and costs. In: International Conference on Machine Learning, pp. 19347–19365 (2022)
Schmidt, G., Ströhlein, T.: Relations and Graphs: Discrete Mathematics for Computer Scientists. Springer, Berlin (2012)
Google Scholar
Singh, G., Mémoli, F., Carlsson, G.E.: Topological methods for the analysis of high dimensional data sets and 3d object recognition. In: The IEEE Eurographics Symposium on Point-Based Graphics (PBG), vol. 2, pp. 091–100 (2007)
Smaniotto, S., Pelillo, M.: Two metrics for attributed hypergraphs. Pattern Recogn. Lett. 149, 143–149 (2021)
Google Scholar
Solomon, J.: Optimal transport on discrete domains. AMS Short Course on Discrete Differential Geometry (2018)
Solomon, J., Peyré, G., Kim, V.G., Sra, S.: Entropic metric alignment for correspondence problems. ACM Trans. Graph. (TOG) 35(4), 1–13 (2016)
Google Scholar
Sturm, K.-T.: On the geometry of metric measure spaces. Acta Math. 196(1), 65–131 (2006)
MathSciNet Google Scholar
Sturm, K.-T.: The space of spaces: curvature bounds and gradient flows on the space of metric measure spaces. arXiv preprint arXiv:1208.0434 (2012)
Sun, L., Ji, S., Ye, J.: Hypergraph spectral learning for multi-label classification. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 668–676 (2008)
Surana, A., Chen, C., Rajapakse, I.: Hypergraph dissimilarity measures. arXiv preprint arXiv:2106.08206 (2021)
Tang, J., Zhang, J., Yao, L., Li, J., Zhang, L., Su, Z.: ArnetMiner: extraction and mining of academic social networks. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 990–998 (2008)
Ulam, S.M.: Some ideas and prospects in biomathematics. Annu. Rev. Biophys. Bioeng. 1, 227–292 (1972)
Google Scholar
Umeyama, S.: An eigen decomposition approach to weighted graph matching problems. IEEE Trans. Pattern Anal. Mach. Intell. 10(5), 695–703 (1988)
Google Scholar
Vayer, T., Courty, N., Tavenard, R., Flamary, R.: Optimal transport for structured data with application on graphs. In: International Conference on Machine Learning, pp. 6275–6284 (2019)
Villani, C.: Topics in Optimal Transportation, vol. 58. American Mathematical Society, Providence, Rhode Island (2003)
Google Scholar
Xia, X., Yin, H., Yu, J., Wang, Q., Cui, L., Zhang, X.: Self-supervised hypergraph convolutional networks for session-based recommendation. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 4503–4511 (2021)
Xu, H., Luo, D., Carin, L.: Scalable Gromov–Wasserstein learning for graph partitioning and matching. In: Advances in Neural Information Processing Systems, pp. 3046–3056 (2019a)
Xu, H., Luo, D., Zha, H., Carin, L.: Gromov–Wasserstein learning for graph matching and node embedding. In: International Conference on Machine Learning, pp. 6932–6941 (2019b)
Xu, H.: Gromov–Wasserstein factorization models for graph clustering. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 6478–6485 (2020)
Zeng, Z., Tung, A.K.H., Wang, J., Feng, J., Zhou, L.: Comparing stars: on approximating graph edit distance. Proc. VLDB Endow. 2(1), 25–36 (2009)
Google Scholar
Zhou, Y., Rathore, A., Purvine, E., Wang, B.: Topological simplifications of hypergraphs. IEEE Trans. Visual Comput. Graphics 29(7), 3209–3225 (2023). https://doi.org/10.1109/TVCG.2022.3153895
Zien, J., Schlag, M., Chan, P.K.: Multi-level spectral hypergraph partitioning with arbitrary vertex sizes. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 18, 1389–1399 (1999)
Google Scholar

Download references

Acknowledgements

We would like to thank the anonymous referees for numerous useful suggestions. This work was partially supported by NSF DMS 2107808, NSF IIS 1910733, NSF IIS 214549 and DOE DE-SC0021015.

Author information

Authors and Affiliations

Stanford University, Stanford, CA, 94304, USA
Samir Chowdhury
Florida State University, Tallahassee, FL, 32306, USA
Tom Needham & Ethan Semrad
University of Utah, Salt Lake City, UT, 610101, USA
Bei Wang & Youjia Zhou

Authors

Samir Chowdhury
View author publications
You can also search for this author in PubMed Google Scholar
Tom Needham
View author publications
You can also search for this author in PubMed Google Scholar
Ethan Semrad
View author publications
You can also search for this author in PubMed Google Scholar
Bei Wang
View author publications
You can also search for this author in PubMed Google Scholar
Youjia Zhou
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Tom Needham.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 Proofs from Sect. 2.3

To prove Theorem 1, we begin with a technical result.

Lemma 24

The infimum in the definition of $d_{\mathcal {H},p}$ is realized.

Proof

For a Polish space X, let $\textrm{Prob}(X)$ denote the set of Borel probability measures over X, endowed with the weak topology.

Let $H = (X,\mu ,Y,\nu ,\omega ), H' = (X',\mu ',Y',\nu ',\omega ') \in \mathcal {H}$. As X and $X'$ are Polish spaces and $\mu $ and $\mu '$ are Borel probability measures, we have that $\mathcal {C}(\mu ,\mu ')$ is compact in $\textrm{Prob}(X\times X')$, by Chowdhury and Mémoli (2019, Lemma 10). It follows that $\mathcal {C}(\mu ,\mu ') \times \mathcal {C}(\nu ,\nu ')$ is compact in $\textrm{Prob}(X\times X') \times \textrm{Prob}(Y \times Y')$.

Now we will show that the distortion functional $\textrm{dis}_p^{\mathcal {H}} = \textrm{dis}_{H,H',p}^\mathcal {H}$ is continuous on $\mathcal {C}(\mu ,\mu ') \times \mathcal {C}(\nu ,\nu ')$, beginning with the $1\le p < \infty $ case. Since we are working in Polish spaces with finite measures, we have that bounded continuous functions are dense. So for every $n\in \mathbb {N}$, we can choose continuous bounded functions $\omega _n\in L^p(\mu \otimes \nu )$ and $\omega '_n\in L^p(\mu '\otimes \nu ')$ such that the following holds for all $x\in X, y\in Y, x'\in X',$ and $ y'\in Y'$:

$$\begin{aligned} \Vert \omega (x,y)-\omega _n(x,y)\Vert _{L^p(\mu \otimes \nu )} \le \frac{1}{n} \quad \text{ and } \quad \Vert \omega '(x',y')-\omega '_n(x',y')\Vert _{L^p(\mu '\otimes \nu ')} \le \frac{1}{n}. \end{aligned}$$

Then for each $n\in \mathbb {N}$ we define a n-distortion functional $\textrm{dis}_p^n: \mathcal {C}(\mu ,\mu ')\times \mathcal {C}(\nu ,\nu ')\longrightarrow \mathbb {R}_+$ as $\textrm{dis}_p^n(\pi ,\xi ):= \Vert \omega _n(x,y) - \omega '_n(x',y') \Vert _{L^p(\pi \otimes \xi )}$

Now let us show that the n-distortion functional is continuous. Let $(\pi ,\xi )\in \mathcal {C}(\mu ,\mu ')\times \mathcal {C}(\nu ,\nu ')$ and let $(\pi _m)_{m\in \mathbb {N}}$ and $(\xi _m)_{m\in \mathbb {N}}$ be sequences in $\mathcal {C}(\mu ,\mu ')$ and $\mathcal {C}(\nu ,\nu ')$ that converge to $\pi $ and $\xi $ with respect to the weak topology. Then

$$\begin{aligned}&\lim _{m\rightarrow \infty }\text {dis}_p^n(\pi _m,\xi _m) \end{aligned}$$

(29)

$$\begin{aligned}&\quad = \lim _{m\rightarrow \infty }\left( \int _{Y\times Y'}\int _{X\times X'} |\omega _n(x,y) - \omega '_n(x',y') |^p \pi _m (dx \times dx')\xi _m(dy \times dy')\right) ^{\frac{1}{p}} \end{aligned}$$

(30)

$$\begin{aligned}&\quad = \left( \int _{Y\times Y'}\int _{X\times X'} |\omega _n(x,y) - \omega '_n(x',y') |^p\pi (dx \times dx')\xi (dy \times dy')\right) ^{\frac{1}{p}} \nonumber \\&\quad = \text {dis}_p^n(\pi ,\xi ). \end{aligned}$$

(31)

So we have that $\text {dis}_p^n$ is sequentially continuous. As $\mathcal {C}(\mu ,\mu ')\times \mathcal {C}(\nu ,\nu ') \subseteq $ Prob$(X\times X'\times Y\times Y')$, it is a metrizable space (Ambrosio et al. (2005), Remark 5.1.1). Therefore, as $\text {dis}_p^n$ is sequentially continuous in a metrizable space, it is also continuous.

Now let us see that $\text {dis}_p^n$ converges to $\text {dis}^\mathcal {H}_p$ uniformly. Let $(\pi ,\xi )\in \mathcal {C}(\mu ,\mu ')\times \mathcal {C}(\nu ,\nu ')$ and observe

$$\begin{aligned}&|\text {dis}_p^{\mathcal {H}}(\pi ,\xi ) - \text {dis}_p^n(\pi ,\xi ) |\\&\quad = |\Vert \omega (x,y) - \omega '(x',y') \Vert _{L^p(\pi \otimes \xi )}-\Vert \omega _n(x,y) - \omega '_n(x',y') \Vert _{L^p(\pi \otimes \xi )} |\\&\quad \le \Vert \omega (x,y) - \omega '(x',y') - \omega _n(x,y) + \omega '_n(x',y') \Vert _{L^p(\pi \otimes \xi )} \\&\quad \le \Vert \omega (x,y)-\omega _n(x,y)\Vert _{L^p(\mu \otimes \nu )} +\Vert \omega '(x',y')-\omega '_n(x',y')\Vert _{L^p(\mu '\otimes \nu ')} \le \frac{2}{n}. \end{aligned}$$

Since $\textrm{dis}_p^{\mathcal {H}}$ is the uniform limit of continuous functions, it is also continuous. It follows that the infimum in (4) is realized with a minimal coupling.

For the case in which $p=\infty $, let $(\pi ,\xi )$ be couplings in $\mathcal {C}(\mu ,\mu ')\times \mathcal {C}(\nu ,\nu ')$. As the domain of $\textrm{dis}_p$, $\mathcal {C}(\mu ,\mu ')\times \mathcal {C}(\nu ,\nu ')$, is a subset of Prob$(X\times X'\times Y\times Y')$, Jensen’s Inequality shows that if $1\le p\le q\le \infty $, then we have that $\textrm{dis}_p^{\mathcal {H}} \le \textrm{dis}_q^{\mathcal {H}}$. Also note that $\lim _{p\rightarrow \infty } \textrm{dis}_p^{\mathcal {H}}(\pi ,\xi ) =\textrm{dis}_\infty ^{\mathcal {H}}(\pi ,\xi )$. Then as for $p\in [1,\infty ), \textrm{dis}_p^{\mathcal {H}}$ is continuous, and $\textrm{dis}_{\infty }^{\mathcal {H}} = \sup \{\textrm{dis}_p^{\mathcal {H}} \, \vert \, p \in [1,\infty ) \}$ we have that $\textrm{dis}_\infty ^{\mathcal {H}}$ is lower semicontinuous. As $\textrm{dis}_p^\mathcal {H}$ is at least lower semicontinuous over a compact domain, we have that the infimum in (4) is realized with a minimal coupling for any $p\in [1,\infty ]$. $\square $

Proof of Theorem 1

In the following, let $H=(X,\mu ,Y,\nu ,\omega )$, $H'=(X',\mu ',Y',\nu ',\omega ')$ and $H'' = (X'',\mu '',Y'',\nu '',\omega '')$ be measure hypernetworks.

Pseudometric. Symmetry is obvious. For the triangle inequality, let $(\pi _{12},\xi _{12})$ and $(\pi _{23},\xi _{23})$ be couplings realizing $d_{\mathcal {H},p}(H,H')$ and $d_{\mathcal {H},p}(H',H'')$, respectively (see Lemma 24). Then, using the gluing lemma (see, e.g., (Sturm 2012, Lemma 1.4)), we construct a probability measure $\pi $ on $X \times X' \times X''$ such that the $X \times X'$ marginal of $\pi $ is $\pi _{12}$, the $X' \times X''$ marginal is $\pi _{23}$ and the $X \times X''$ marginal is a coupling of $\mu $ and $\mu ''$, which we denote $\pi _{13}$. Likewise, we construct a probability measure $\xi $ on $Y \times Y' \times Y''$ with analogous properties—in particular, the $Y \times Y''$ marginal is denoted $\xi _{13}$. Then, using the various marginal conditions, we have

$$\begin{aligned}&d_{\mathcal {H},p}(H,H'') \\&\quad \le \textrm{dis}_p^\mathcal {H}(\pi _{13},\xi _{13}) \\&\quad =\Vert \omega -\omega ''\Vert _{L^p(\pi \otimes \xi )}\\&\quad \le \Vert \omega -\omega '\Vert _{L^p(\pi \otimes \xi )}+\Vert \omega '-\omega ''\Vert _{L^p(\pi \otimes \xi )}\\&\quad =\Vert \omega -\omega '\Vert _{L^p(\pi _{12}\otimes \xi _{12})}+\Vert \omega '-\omega ''\Vert _{L^p(\pi _{23}\otimes \xi _{23})} = d_{\mathcal {H},p}(H,H')+d_{\mathcal {H},p}(H',H''). \end{aligned}$$

Complete. We prove completeness by adapting the proof of Sturm (2012, Theorem 5.8). Let $H_n = (X_n,\mu _n,Y_n,\nu _n,\omega _n)$ be a Cauchy sequence of measure hypernetworks. Then we can choose couplings $\pi _n \in \mathcal {C}(\mu _{n},\mu _{n+1})$ and $\xi _n \in \mathcal {C}(\nu _n,\nu _{n+1})$ such that $\textrm{dis}_p^{\mathcal {H}}(\pi _n,\xi _n) \rightarrow 0$. We then apply the generalized Gluing Lemma (Sturm 2012, Lemma 1.4) to $\pi _1,\ldots ,\pi _{N-1}$ (respectively, to $\xi _1,\ldots ,\xi _{N-1}$) to construct a measure $\Pi _N$ on $\prod _{n=1}^N X_n$ (respectively, $\Xi _N$ on $\prod _{n=1}^N Y_N$). We now define $\overline{\mu }$ to be the projective limit $\varprojlim \Pi _N$, a well-defined probability measure on $\prod _{n=1}^\infty X_n$ (e.g., Bochner 2020). Set $\overline{X}:= \textrm{supp}(\overline{\mu })$; then $\overline{X}$ is a closed subspace of a countable product of Polish spaces and is therefore Polish. Similarly, define $\overline{\nu }:= \varprojlim \Xi _N$ and $\overline{Y}:= \textrm{supp}(\overline{\nu })$. Next we define $\Omega _N: \overline{X} \times \overline{Y} \rightarrow {\mathbb {R}}$ by

$$\begin{aligned} \Omega _N((x_i)_{i=1}^\infty ,(y_i)_{i=1}^\infty ) = \omega _N(x_N,y_N). \end{aligned}$$

The sequence $(\Omega _N)_{N=1}^\infty $ is a Cauchy sequence in the Banach space $L^p(\overline{X} \times \overline{Y}, \overline{\mu } \otimes \overline{\nu })$, since

$$\begin{aligned} \Vert \Omega _N - \Omega _{N+1}\Vert _{L^p(\overline{\mu } \otimes \overline{\nu })} = \textrm{dis}_p^{\mathcal {H}}(\pi _N,\xi _N) \rightarrow 0, \end{aligned}$$

and we define $\overline{\omega }$ to be its limit. Each $\omega _N$ is essentially bounded, so it follows that $\overline{\omega }$ is as well. Setting $\overline{H}:= (\overline{X},\overline{\mu },\overline{Y},\overline{\nu },\overline{\omega })$, we have $\overline{H} \in \mathcal {H}$ and

$$\begin{aligned} d_{\mathcal {H},p}(H_n,\overline{H}) \le \Vert \omega _n - \overline{\omega }\Vert _{L^p(\overline{\mu } \otimes \overline{\nu })} \rightarrow 0. \end{aligned}$$

Geodesic. Finally, we show that the metric on $\mathcal {H}$ modulo weak isomorphism is geodesic. For $H \in \mathcal {H}$, let [H] denote its weak isomorphism class. We will construct an explicit geodesic between any two weak isomorphism classes of hypernetworks [H] and $[H']$, following the main idea of the proof of Sturm (2012, Theorem 3.1). Let $\pi \in \mathcal {C}(\mu ,\mu ')$ and $\xi \in \mathcal {C}(\nu ,\nu ')$ be optimal couplings (which exist, by Lemma 24). Consider the path $\gamma :[0,1] \rightarrow \mathcal {H}$ defined by

$$\begin{aligned} \gamma (t) = \left( \textrm{supp}(\pi ), \pi , \textrm{supp}(\xi ), \xi , \omega _t\right) , \end{aligned}$$

(32)

where

$$\begin{aligned} \omega _t((x,x'),(y,y')) = (1-t)\omega (x,y) + t \omega '(x',y'). \end{aligned}$$

Observe that $\gamma (0) \in [H]$ and $\gamma (1) \in [H']$; indeed, coordinate projection maps $\phi :X \times X' \rightarrow X$ and $\psi :Y \times Y' \rightarrow Y$ define a basic weak isomorphism $\gamma (0) \rightarrow H$ and a similar construction works for $\gamma (1) \rightarrow H'$. To show that $\gamma $ defines a geodesic in $[\mathcal {H}]$, it suffices to show that for all $0 \le s \le t \le 1$,

$$\begin{aligned} d_{\mathcal {H},p}(\gamma (s),\gamma (t)) \le (t-s) d_{\mathcal {H},p}(H,H') \end{aligned}$$

(33)

(see, e.g., (Chowdhury and Mémoli 2018, Lemma 1.3)). One can show by an elementary computation that

$$\begin{aligned} \textrm{dis}_{\gamma (s),\gamma (t),p}^{\mathcal {H}}(\pi ,\xi )^p \le (t-s)^p \textrm{dis}_{H,H',p}^{\mathcal {H}}(\pi ,\xi )^p \end{aligned}$$

for any pair of couplings $(\pi ,\xi )$—indeed, this holds because linear interpolations are geodesics in $L^p$ spaces—and (33) follows. $\square $

Remark 12

The proof of the triangle inequality above uses the notion of gluing, which appears in our category theoretic framework in Sect. 3. The coupling $\pi _{13}$ constructed above is written as $\pi _{13} = \pi _{23} \bullet \pi _{13}$ in the notation from Sect. 3. A byproduct of our proof is the following useful fact: for measure networks $N,N',N''$ and couplings $\pi \in \mathcal {C}(\mu ,\mu ')$ and $\pi ' \in \mathcal {C}(\mu ',\mu '')$, we have

$$\begin{aligned} \textrm{dis}_p^\mathcal {N}(\pi ' \bullet \pi ) \le \textrm{dis}_p^\mathcal {N}(\pi ') + \textrm{dis}_p^\mathcal {N}(\pi ). \end{aligned}$$

A similar statement holds in the measure hypernetwork setting.

1.2 Proofs of technical Lemmas from Sect. 3.6

Proof of Lemma 12

We will prove the statement for hypernetworks, as the network case is similar. Let $\pi $ and $\xi $ be couplings realizing $d_{\mathcal {H},p}(H,H')$ (such couplings always exist, see Lemma 24 in the appendix). Define

$$\begin{aligned} \overline{X} = \textrm{supp}(\pi ) \subset X \times X', \quad \overline{\mu } = \pi \mid _{\overline{X}}, \quad \overline{Y} = \textrm{supp}(\xi ) \subset Y \times Y', \quad \overline{\nu } = \xi \mid _{\overline{Y}} \end{aligned}$$

and define hypernetwork functions $\overline{\omega },\overline{\omega }':\overline{X} \times \overline{Y} \rightarrow {\mathbb {R}}$ as follows. Let $p_Z:X \times X' \rightarrow Z$ be coordinate projection for $Z \in \{X,X'\}$ and let $p_Z:Y \times Y' \rightarrow Z$ be coordinate projection for $Z \in \{Y,Y'\}$. The hypernetwork functions are then given by

$$\begin{aligned} \overline{\omega } = \omega \circ (p_X \times p_Y) \qquad \text{ and } \qquad \overline{\omega }' = \omega ' \circ (p_{X'} \times p_{Y'}). \end{aligned}$$

The coordinate projections are measure-preserving, due to the marginal constraints of $\pi $ and $\xi $, and the pair $(p_X,p_Y)$ (respectively, $(p_{X'},p_{Y'})$) therefore defines a basic weak isomorphism (Definition 6) from $\overline{H}$ to H (respectively, $\overline{H'}$ to $H'$).

To see (12), observe first that $d_{\mathcal {H},p}(H,H') = d_{\mathcal {H},p}(\overline{H},\overline{H'})$ follows by weak isomorphism. To see the other equality for $p \in [1,\infty )$, we have

$$\begin{aligned} d_{\mathcal {H},p}(H,H')^p&= \int _{Y \times Y'} \int _{X \times X'} |\omega (x,y) - \omega '(x',y')|^p \; \pi (dx \times dx') \xi (dy \times dy') \\&= \int _{\overline{Y}} \int _{\overline{X}} |\omega (x,y) - \omega '(x',y')|^p \; \overline{\mu }(dx \times dx') \overline{\nu }(dy \times dy'), \\ \end{aligned}$$

since the value of the integral isn’t changed by replacing the full spaces with supports of the measures. Using the change of variables $\overline{x} = (x,x') \in \overline{X}$ and $\overline{y} = (y,y') \in \overline{Y}$, the above is equal to

$$\begin{aligned} \int _{\overline{Y}} \int _{\overline{X}} |\overline{\omega }(\overline{x},\overline{y}) - \overline{\omega }'(\overline{x},\overline{y})|^p \; \overline{\mu }(d\overline{x}) \overline{\nu }(d\overline{y}) = \Vert \overline{\omega } - \overline{\omega }'\Vert _{L^p(\overline{\mu } \otimes \overline{\nu })}^p. \end{aligned}$$

For the $p = \infty $ case, a similar change of variables gives

$$\begin{aligned} d_{\mathcal {H},\infty }(H,H')&= \inf \{c \mid |\omega (x,y) - \omega '(x',y')|\le c \text{ for } \pi \otimes \xi -\text{ almost } \text{ every } (x,y,x',y')\} \\&= \inf \{c \mid |\overline{\omega }(\overline{x},\overline{y}) - \overline{\omega }'(\overline{x},\overline{y})|\\&\le c \text{ for } \overline{\mu } \otimes \overline{\xi }-\text{ almost } \text{ every } (\overline{x},\overline{y})\} \\&= \Vert \overline{\omega } - \overline{\omega }'\Vert _{L^\infty (\overline{\mu } \times \overline{\nu })}. \end{aligned}$$

$\square $

Proof of Lemma 13

By Proposition 2, it suffices to show that if there is a basic weak isomorphism $(\phi ,\psi )$ from H to $H'$, then $\phi $ defines a basic weak isomorphism of $N:= \textsf{Q}_q(H) = (X,\mu ,\omega _{\textsf{Q}_q})$ and $N':= \textsf{Q}_q(H') = (X',\mu ',\omega _{\textsf{Q}_q}')$. By an argument similar to the one used to prove functoriality in Theorem 11, the assumption that $(\phi ,\psi )$ is a basic weak isomorphism implies that for $\mu \otimes \mu \otimes \nu $-almost every $(x_1,x_2,y) \in X \times X \times Y$, $\min _j \omega (x_j,y) = \min _j \omega '(\phi (x_j),\psi (y))$. In the $q < \infty $ case, we deduce that for $\mu \otimes \mu $ almost every $(x_1,x_2)$,

$$\begin{aligned} \omega _{\textsf{Q}_q}'(\phi (x_1),\phi (x_2))^p&= \int _{Y'} \min _j \omega '(\phi (x_j),y')^p \; \nu ' (dy') \nonumber \\&= \int _{Y} \min _j \omega '(\phi (x_j),\psi (y))^p \; \nu (dy)\nonumber \\&= \int _{Y} \min _j \omega (x_j,y)^p \; \nu (dy) \nonumber \\&= \omega _{\textsf{Q}_q}(x_1,x_2)^p, \end{aligned}$$

(34)

where (34) follows by the change-of-variables formula (using that $\psi $ is measure-preserving). The $q=\infty $ case follows by a limiting argument. $\square $

Proof of Lemma 14

This is Mémoli (2011a, Theorem 5.1 (d)), in the metric measure space setting. The proof goes through in the measure network setting. $\square $

1.3 Runtime analysis for hypergraph simplification

We ran the experiments using the following computing infrastructures:

CPU models: 32 Intel Xeon Silver 4108 CPU 1.80 GHz cores (HT)
Amount of memory: 132 GB of RAM
Operating system: OpenSUSE Leap 15.3 (x86_64)

Table 1 summarizes the size of each dataset, including the number of hyperedges and the number of vertices contained in each hypergraph, the runtime of computing the barcode, the runtime of performing the simplification (simp.) at the first step, and the maximum runtime among all simplification steps for each dataset. Figure 12 demonstrates the distribution and the minimum runtime of computing the hypernetwork distances $d_{\mathcal {H},2}(H_0, H_i)$ at each step i. Table 2 summarizes the minimum runtime at the first step as well as the minimum, maximum, mean, and median values of the minimum runtime of computing the distances across all simplification steps for each dataset.

The largest dataset is the Coauthor Network dataset, which contains 506 nodes and 491 hyperedges. Therefore, this dataset gives rise to the largest runtime of computing the barcode, simplified hypergraphs, and hypernetwork distances. The maximum runtime among all steps is around 5 s for the Coauthor Network dataset.

Table 1 The number of vertices (#v), the number of hyperedges (#e), the runtime (in seconds) of computing the barcode, the runtime (in seconds) of performing the simplification at the first step, and the maximum runtime (in seconds) among all simplification steps in datasets ActiveDNS (DNS), Coauthor Network (CN), and Les Misérables (LM)

Full size table

Table 2 The minimum runtime (in seconds) at the first step as well as the minimum, maximum, mean, and median values of the minimum runtime (in seconds) of computing the distances across all simplification steps for each dataset

Full size table

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Chowdhury, S., Needham, T., Semrad, E. et al. Hypergraph co-optimal transport: metric and categorical properties. J Appl. and Comput. Topology 8, 1171–1230 (2024). https://doi.org/10.1007/s41468-023-00142-9

Download citation

Received: 22 June 2022
Revised: 01 February 2023
Accepted: 19 August 2023
Published: 30 September 2023
Issue Date: October 2024
DOI: https://doi.org/10.1007/s41468-023-00142-9