Nothing Special   »   [go: up one dir, main page]

Skip to main content
Log in

Hypergraph co-optimal transport: metric and categorical properties

  • Published:
Journal of Applied and Computational Topology Aims and scope Submit manuscript

Abstract

Hypergraphs capture multi-way relationships in data, and they have consequently seen a number of applications in higher-order network analysis, computer vision, geometry processing, and machine learning. In this paper, we develop theoretical foundations for studying the space of hypergraphs using ingredients from optimal transport. By enriching a hypergraph with probability measures on its nodes and hyperedges, as well as relational information capturing local and global structures, we obtain a general and robust framework for studying the collection of all hypergraphs. First, we introduce a hypergraph distance based on the co-optimal transport framework of Redko et al. and study its theoretical properties. Second, we formalize common methods for transforming a hypergraph into a graph as maps between the space of hypergraphs and the space of graphs, and study their functorial properties and Lipschitz bounds. Finally, we demonstrate the versatility of our Hypergraph Co-Optimal Transport (HyperCOT) framework through various examples.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Algorithm 1
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11

Similar content being viewed by others

Notes

  1. Hypergraph Co-Optimal Transport: https://github.com/samirchowdhury/HyperCOT.

  2. https://github.com/PythonOT/COOT.

  3. https://pythonot.github.io/.

  4. Hypergraph Co-Optimal Transport: https://github.com/samirchowdhury/HyperCOT.

References

  • Agarwal, S., Lim, J., Zelnik-Manor, L., Perona, P., Kriegman, D., Belongie, S.: Beyond pairwise clustering. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. 838–845 (2005)

  • Ambrosio, L., Gigli, N., Savaré, G.: Gradient Flows: In Metric Spaces and in the Space of Probability Measures. Birkhäuser Verlag, Basel (2005)

    Google Scholar 

  • Banerjee, A., Char, A., Mondal, B.: Spectra of general hypergraphs. Linear Algebra Appl. 1, 14–30 (2017)

    MathSciNet  Google Scholar 

  • Bermond, J.-C., Heydemann, M.-C., Sotteau, D.: Line graphs of hypergraphs I. Discrete Math. 18(3), 235–241 (1977)

    MathSciNet  Google Scholar 

  • Blumberg, A.J., Carriere, M., Mandell, M.A., Rabadan, R., Villar, S.: MREC: a fast and versatile framework for aligning and matching point clouds with applications to single cell molecular data. arXiv preprint arXiv:2001.01666 (2020)

  • Bochner, S.: Harmonic Analysis and the Theory of Probability. University of California Press, Berkeley (2020)

    Google Scholar 

  • Borceux, F.: Handbook of Categorical Algebra: Basic Category Theory, vol. 1. Cambridge University Press, Cambridge (1994)

    Google Scholar 

  • Bretto, A.: Hypergraph Theory: An Introduction. Springer, New York (2013)

    Google Scholar 

  • Bronstein, A.M., Bronstein, M.M., Kimmel, R.: Numerical Geometry of Non-rigid Shapes. Springer, New York (2008)

    Google Scholar 

  • Carlsson, G., Mémoli, F.: Persistent clustering and a theorem of J. Kleinberg. arXiv preprint arXiv:0808.2241 (2008)

  • Carlsson, G., Mémoli, F.: Characterization, stability and convergence of hierarchical clustering methods. J. Mach. Learn. Res. 11, 1425–1470 (2010)

    MathSciNet  Google Scholar 

  • Carlsson, G., Mémoli, F.: Classifying clustering schemes. Found. Comput. Math. 13(2), 221–252 (2013)

    MathSciNet  Google Scholar 

  • Carlsson, G., Mémoli, F., Ribeiro, A., Segarra, S.: Axiomatic construction of hierarchical clustering in asymmetric networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5219–5223 (2013)

  • Carlsson, G., Mémoli, F., Ribeiro, A., Segarra, S.: Hierarchical clustering of asymmetric networks. Adv. Data Anal. Classif. 12(1), 65–105 (2018)

    MathSciNet  Google Scholar 

  • Carlsson, G., Mémoli, F., Segarra, S.: Robust hierarchical clustering for directed networks: an axiomatic approach. SIAM J. Appl. Algebra Geom. 5(4), 675–700 (2021)

    MathSciNet  Google Scholar 

  • Cencetti, G., Battiston, F., Lepri, B., Karsai, M.: Temporal properties of higher-order interactions in social networks. Sci. Rep. 11, 7028 (2021)

    Google Scholar 

  • Chertok, M., Keller, Y.: Efficient high order matching. IEEE Trans. Pattern Anal. Mach. Intell. 32(12), 2205–2215 (2010)

    Google Scholar 

  • Chowdhury, S., Miller, D., Needham, T.: Quantized Gromov–Wasserstein. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 811–827 (2021)

  • Chowdhury, S., Mémoli, F.: Explicit geodesics in Gromov–Hausdorff space. Electron. Res. Announc. Math. Sci. 25, 48 (2018)

    MathSciNet  Google Scholar 

  • Chowdhury, S., Mémoli, F.: The Gromov–Wasserstein distance between networks and stable network invariants. Inf. Inference J. IMA 8(4), 757–787 (2019)

    MathSciNet  Google Scholar 

  • Chowdhury, S., Needham, T.: Gromov–Wasserstein averaging in a Riemannian framework. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 842–843 (2020)

  • Chowdhury, S., Needham, T.: Generalized spectral clustering via Gromov–Wasserstein learning. In: International Conference on Artificial Intelligence and Statistics, pp. 712–720 (2021)

  • Culbertson, J., Guralnik, D.P., Hansen, J., Stiller, P.F.: Consistency constraints for overlapping data clustering. arXiv preprint arXiv:1608.04331 (2016)

  • Culbertson, J., Guralnik, D.P., Stiller, P.F.: Functorial hierarchical clustering with overlaps. Discrete Appl. Math. 236, 108–123 (2018)

    MathSciNet  Google Scholar 

  • Dey, T.K., Mémoli, F., Wang, Y.: Multiscale mapper: topological summarization via codomain covers. In: Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 997–1013 (2016)

  • Dörfler, W., Waller, D.: A category-theoretical approach to hypergraphs. Arch. Math. 34(1), 185–192 (1980)

    MathSciNet  Google Scholar 

  • Duchenne, O., Bach, F., Kweon, I.-S., Ponce, J.: A tensor-based algorithm for high-order graph matching. IEEE Trans. Pattern Anal. Mach. Intell. 33(12), 2383–2395 (2011)

    Google Scholar 

  • Flamary, R., Courty, N.: POT: Python Optimal Transport library (2017)

  • Flamary, R., Courty, N., Gramfort, A., Alaya, M.Z., Boisbunon, A., Chambon, S., Chapel, L., Corenflos, A., Fatras, K., Fournier, N., Gautheron, L., Gayraud, N.T.H., Janati, H., Rakotomamonjy, A., Redko, I., Rolet, A., Schutz, A., Seguy, V., Sutherland, D.J., Tavenard, R., Tong, A., Vayer, T.: Pot: Python optimal transport. J. Mach. Learn. Res. 22(78), 1–8 (2021)

    Google Scholar 

  • Folland, G.B.: Real Analysis: Modern Techniques and Their Applications. Wiley, Hoboken (1999)

    Google Scholar 

  • Gallo, G., Ülkücü, A.: Bilinear programming: an exact algorithm. Math. Program. 12(1), 173–194 (1977)

    MathSciNet  Google Scholar 

  • Gold, S., Rangarajan, A.: A graduated assignment algorithm for graph matching. IEEE Trans. Pattern Anal. Mach. Intell. 18(4), 377–388 (1996)

    Google Scholar 

  • Hell, P., Nesetril, J.: Graphs and Homomorphisms, vol. 28. OUP, Oxford (2004)

    Google Scholar 

  • Hendrikson, R.: Using Gromov–Wasserstein distance to explore sets of networks. Master’s thesis, University of Tartu (2016)

  • Isbell, J.R.: Six theorems about injective metric spaces. Commentarii Mathematici Helvetici 39(1), 65–76 (1964)

    MathSciNet  Google Scholar 

  • Karoński, M., Palka, Z.: On Marczewski–Steinhaus type distance between hypergraphs. Applicationes Mathematicae 16(1), 47–57 (1977)

    MathSciNet  Google Scholar 

  • Klamt, S., Haus, U.-U., Theis, F.: Hypergraphs and cellular networks. PLoS Comput. Biol. 5, e1000385 (2009)

    MathSciNet  Google Scholar 

  • Knuth, D.E.: The Stanford GraphBase: A Platform for Combinatorial Computing. ACM Press, New York (1994)

    Google Scholar 

  • Konstantinova, E.V., Skorobogatov, V.A.: Application of hypergraph theory in chemistry. Discrete Math. 235(1–3), 365–383 (2001)

    MathSciNet  Google Scholar 

  • Kountouras, A., Kintis, P., Lever, C., Chen, Y., Nadji, Y., Dagon, D., Antonakakis, M., Joffe, R.: Enabling network security through active DNS datasets. In: International Symposium on Research in Attacks, Intrusions, and Defenses, pp. 188–208 (2016)

  • Lawvere, F.W.: The category of probabilistic mappings. Preprint (1962)

  • Lee, J., Cho, M., Lee, K.M.: Hyper-graph matching via reweighted random walks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2011)

  • Li, M., Palande, S., Yan, L., Wang, B.: Sketching merge trees for scientific data visualization. arXiv preprint arXiv:2101.03196 (2021)

  • Lin, C.-L.: Hardness of approximating graph transformation problem. In: Algorithms and Computation: 5th International Symposium, pp. 74–82 (1994)

  • Lotito, Q.F., Musciotto, F., Montresor, A., Battiston, F.: Higher-order motif analysis in hypergraphs. Commun. Phys. 5, 79 (2022)

    Google Scholar 

  • Mémoli, F.: On the use of Gromov-Hausdorff distances for shape comparison. In: Eurographics Symposium on Point-Based Graphics, pp. 81–90 (2007)

  • Mémoli, F.: Gromov–Wasserstein distances and the metric approach to object matching. Found. Comput. Math. 11(4), 417–487 (2011a)

  • Mémoli, F.: A spectral notion of Gromov–Wasserstein distance and related methods. Appl. Comput. Harmon. Anal. 30(3), 363–401 (2011b)

  • Mutlu, A., Gürdal, U.: Bipolar metric spaces and some fixed point theorems. J. Nonlinear Sci. Appl. 9(9), 5362–5373 (2016)

    MathSciNet  Google Scholar 

  • Patania, A., Petri, G., Vaccarino, F.: The shape of collaborations. PJ Data Sci. 6, 18 (2017)

    Google Scholar 

  • Perrone, P.: Lifting couplings in wasserstein spaces. arXiv preprint arXiv: 2110.06591 (2021)

  • Peyré, G., Cuturi, M., Solomon, J.: Gromov–Wasserstein averaging of kernel and distance matrices. In: International Conference on Machine Learning, pp. 2664–2672 (2016)

  • Peyré, G., Cuturi, M., et al.: Computational optimal transport: with applications to data science. Found. Trends Mach. Learn. 11(5–6), 355–607 (2019)

    Google Scholar 

  • Pu, L., Faltings, B.: Hypergraph learning with hyperedge expansion. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 410–425 (2012)

  • Redko, I., Vayer, T., Flamary, R., Courty, N.: Co-optimal transport. In: Advances in Neural Information Processing Systems, vol. 33, pp. 17559–17570 (2020)

  • Reuter, M., Wolter, F.-E., Peinecke, N.: Laplace-Beltrami spectra as ‘Shape-DNA’ of surfaces and solids. Comput. Aided Des. 38(4), 342–366 (2006)

    Google Scholar 

  • Rodri’guez, J.A.: On the Laplacian eigenvalues and metric parameters of hypergraphs. Linear Multilinear Algebra 50(1), 1–14 (2002)

    MathSciNet  Google Scholar 

  • Scetbon, M., Peyré, G., Cuturi, M.: Linear-time gromov wasserstein distances using low rank couplings and costs. In: International Conference on Machine Learning, pp. 19347–19365 (2022)

  • Schmidt, G., Ströhlein, T.: Relations and Graphs: Discrete Mathematics for Computer Scientists. Springer, Berlin (2012)

    Google Scholar 

  • Singh, G., Mémoli, F., Carlsson, G.E.: Topological methods for the analysis of high dimensional data sets and 3d object recognition. In: The IEEE Eurographics Symposium on Point-Based Graphics (PBG), vol. 2, pp. 091–100 (2007)

  • Smaniotto, S., Pelillo, M.: Two metrics for attributed hypergraphs. Pattern Recogn. Lett. 149, 143–149 (2021)

    Google Scholar 

  • Solomon, J.: Optimal transport on discrete domains. AMS Short Course on Discrete Differential Geometry (2018)

  • Solomon, J., Peyré, G., Kim, V.G., Sra, S.: Entropic metric alignment for correspondence problems. ACM Trans. Graph. (TOG) 35(4), 1–13 (2016)

    Google Scholar 

  • Sturm, K.-T.: On the geometry of metric measure spaces. Acta Math. 196(1), 65–131 (2006)

    MathSciNet  Google Scholar 

  • Sturm, K.-T.: The space of spaces: curvature bounds and gradient flows on the space of metric measure spaces. arXiv preprint arXiv:1208.0434 (2012)

  • Sun, L., Ji, S., Ye, J.: Hypergraph spectral learning for multi-label classification. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 668–676 (2008)

  • Surana, A., Chen, C., Rajapakse, I.: Hypergraph dissimilarity measures. arXiv preprint arXiv:2106.08206 (2021)

  • Tang, J., Zhang, J., Yao, L., Li, J., Zhang, L., Su, Z.: ArnetMiner: extraction and mining of academic social networks. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 990–998 (2008)

  • Ulam, S.M.: Some ideas and prospects in biomathematics. Annu. Rev. Biophys. Bioeng. 1, 227–292 (1972)

    Google Scholar 

  • Umeyama, S.: An eigen decomposition approach to weighted graph matching problems. IEEE Trans. Pattern Anal. Mach. Intell. 10(5), 695–703 (1988)

    Google Scholar 

  • Vayer, T., Courty, N., Tavenard, R., Flamary, R.: Optimal transport for structured data with application on graphs. In: International Conference on Machine Learning, pp. 6275–6284 (2019)

  • Villani, C.: Topics in Optimal Transportation, vol. 58. American Mathematical Society, Providence, Rhode Island (2003)

    Google Scholar 

  • Xia, X., Yin, H., Yu, J., Wang, Q., Cui, L., Zhang, X.: Self-supervised hypergraph convolutional networks for session-based recommendation. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 4503–4511 (2021)

  • Xu, H., Luo, D., Carin, L.: Scalable Gromov–Wasserstein learning for graph partitioning and matching. In: Advances in Neural Information Processing Systems, pp. 3046–3056 (2019a)

  • Xu, H., Luo, D., Zha, H., Carin, L.: Gromov–Wasserstein learning for graph matching and node embedding. In: International Conference on Machine Learning, pp. 6932–6941 (2019b)

  • Xu, H.: Gromov–Wasserstein factorization models for graph clustering. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 6478–6485 (2020)

  • Zeng, Z., Tung, A.K.H., Wang, J., Feng, J., Zhou, L.: Comparing stars: on approximating graph edit distance. Proc. VLDB Endow. 2(1), 25–36 (2009)

    Google Scholar 

  • Zhou, Y., Rathore, A., Purvine, E., Wang, B.: Topological simplifications of hypergraphs. IEEE Trans. Visual Comput. Graphics 29(7), 3209–3225 (2023). https://doi.org/10.1109/TVCG.2022.3153895

  • Zien, J., Schlag, M., Chan, P.K.: Multi-level spectral hypergraph partitioning with arbitrary vertex sizes. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 18, 1389–1399 (1999)

    Google Scholar 

Download references

Acknowledgements

We would like to thank the anonymous referees for numerous useful suggestions. This work was partially supported by NSF DMS 2107808, NSF IIS 1910733, NSF IIS 214549 and DOE DE-SC0021015.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Tom Needham.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix

1.1 Proofs from Sect. 2.3

To prove Theorem 1, we begin with a technical result.

Lemma 24

The infimum in the definition of \(d_{\mathcal {H},p}\) is realized.

Proof

For a Polish space X, let \(\textrm{Prob}(X)\) denote the set of Borel probability measures over X, endowed with the weak topology.

Let \(H = (X,\mu ,Y,\nu ,\omega ), H' = (X',\mu ',Y',\nu ',\omega ') \in \mathcal {H}\). As X and \(X'\) are Polish spaces and \(\mu \) and \(\mu '\) are Borel probability measures, we have that \(\mathcal {C}(\mu ,\mu ')\) is compact in \(\textrm{Prob}(X\times X')\), by Chowdhury and Mémoli (2019, Lemma 10). It follows that \(\mathcal {C}(\mu ,\mu ') \times \mathcal {C}(\nu ,\nu ')\) is compact in \(\textrm{Prob}(X\times X') \times \textrm{Prob}(Y \times Y')\).

Now we will show that the distortion functional \(\textrm{dis}_p^{\mathcal {H}} = \textrm{dis}_{H,H',p}^\mathcal {H}\) is continuous on \(\mathcal {C}(\mu ,\mu ') \times \mathcal {C}(\nu ,\nu ')\), beginning with the \(1\le p < \infty \) case. Since we are working in Polish spaces with finite measures, we have that bounded continuous functions are dense. So for every \(n\in \mathbb {N}\), we can choose continuous bounded functions \(\omega _n\in L^p(\mu \otimes \nu )\) and \(\omega '_n\in L^p(\mu '\otimes \nu ')\) such that the following holds for all \(x\in X, y\in Y, x'\in X',\) and \( y'\in Y'\):

$$\begin{aligned} \Vert \omega (x,y)-\omega _n(x,y)\Vert _{L^p(\mu \otimes \nu )} \le \frac{1}{n} \quad \text{ and } \quad \Vert \omega '(x',y')-\omega '_n(x',y')\Vert _{L^p(\mu '\otimes \nu ')} \le \frac{1}{n}. \end{aligned}$$

Then for each \(n\in \mathbb {N}\) we define a n-distortion functional \(\textrm{dis}_p^n: \mathcal {C}(\mu ,\mu ')\times \mathcal {C}(\nu ,\nu ')\longrightarrow \mathbb {R}_+\) as \(\textrm{dis}_p^n(\pi ,\xi ):= \Vert \omega _n(x,y) - \omega '_n(x',y') \Vert _{L^p(\pi \otimes \xi )}\)

Now let us show that the n-distortion functional is continuous. Let \((\pi ,\xi )\in \mathcal {C}(\mu ,\mu ')\times \mathcal {C}(\nu ,\nu ')\) and let \((\pi _m)_{m\in \mathbb {N}}\) and \((\xi _m)_{m\in \mathbb {N}}\) be sequences in \(\mathcal {C}(\mu ,\mu ')\) and \(\mathcal {C}(\nu ,\nu ')\) that converge to \(\pi \) and \(\xi \) with respect to the weak topology. Then

$$\begin{aligned}&\lim _{m\rightarrow \infty }\text {dis}_p^n(\pi _m,\xi _m) \end{aligned}$$
(29)
$$\begin{aligned}&\quad = \lim _{m\rightarrow \infty }\left( \int _{Y\times Y'}\int _{X\times X'} |\omega _n(x,y) - \omega '_n(x',y') |^p \pi _m (dx \times dx')\xi _m(dy \times dy')\right) ^{\frac{1}{p}} \end{aligned}$$
(30)
$$\begin{aligned}&\quad = \left( \int _{Y\times Y'}\int _{X\times X'} |\omega _n(x,y) - \omega '_n(x',y') |^p\pi (dx \times dx')\xi (dy \times dy')\right) ^{\frac{1}{p}} \nonumber \\&\quad = \text {dis}_p^n(\pi ,\xi ). \end{aligned}$$
(31)

So we have that \(\text {dis}_p^n\) is sequentially continuous. As \(\mathcal {C}(\mu ,\mu ')\times \mathcal {C}(\nu ,\nu ') \subseteq \) Prob\((X\times X'\times Y\times Y')\), it is a metrizable space (Ambrosio et al. (2005), Remark 5.1.1). Therefore, as \(\text {dis}_p^n\) is sequentially continuous in a metrizable space, it is also continuous.

Now let us see that \(\text {dis}_p^n\) converges to \(\text {dis}^\mathcal {H}_p\) uniformly. Let \((\pi ,\xi )\in \mathcal {C}(\mu ,\mu ')\times \mathcal {C}(\nu ,\nu ')\) and observe

$$\begin{aligned}&|\text {dis}_p^{\mathcal {H}}(\pi ,\xi ) - \text {dis}_p^n(\pi ,\xi ) |\\&\quad = |\Vert \omega (x,y) - \omega '(x',y') \Vert _{L^p(\pi \otimes \xi )}-\Vert \omega _n(x,y) - \omega '_n(x',y') \Vert _{L^p(\pi \otimes \xi )} |\\&\quad \le \Vert \omega (x,y) - \omega '(x',y') - \omega _n(x,y) + \omega '_n(x',y') \Vert _{L^p(\pi \otimes \xi )} \\&\quad \le \Vert \omega (x,y)-\omega _n(x,y)\Vert _{L^p(\mu \otimes \nu )} +\Vert \omega '(x',y')-\omega '_n(x',y')\Vert _{L^p(\mu '\otimes \nu ')} \le \frac{2}{n}. \end{aligned}$$

Since \(\textrm{dis}_p^{\mathcal {H}}\) is the uniform limit of continuous functions, it is also continuous. It follows that the infimum in (4) is realized with a minimal coupling.

For the case in which \(p=\infty \), let \((\pi ,\xi )\) be couplings in \(\mathcal {C}(\mu ,\mu ')\times \mathcal {C}(\nu ,\nu ')\). As the domain of \(\textrm{dis}_p\), \(\mathcal {C}(\mu ,\mu ')\times \mathcal {C}(\nu ,\nu ')\), is a subset of Prob\((X\times X'\times Y\times Y')\), Jensen’s Inequality shows that if \(1\le p\le q\le \infty \), then we have that \(\textrm{dis}_p^{\mathcal {H}} \le \textrm{dis}_q^{\mathcal {H}}\). Also note that \(\lim _{p\rightarrow \infty } \textrm{dis}_p^{\mathcal {H}}(\pi ,\xi ) =\textrm{dis}_\infty ^{\mathcal {H}}(\pi ,\xi )\). Then as for \(p\in [1,\infty ), \textrm{dis}_p^{\mathcal {H}}\) is continuous, and \(\textrm{dis}_{\infty }^{\mathcal {H}} = \sup \{\textrm{dis}_p^{\mathcal {H}} \, \vert \, p \in [1,\infty ) \}\) we have that \(\textrm{dis}_\infty ^{\mathcal {H}}\) is lower semicontinuous. As \(\textrm{dis}_p^\mathcal {H}\) is at least lower semicontinuous over a compact domain, we have that the infimum in (4) is realized with a minimal coupling for any \(p\in [1,\infty ]\). \(\square \)

Proof of Theorem 1

In the following, let \(H=(X,\mu ,Y,\nu ,\omega )\), \(H'=(X',\mu ',Y',\nu ',\omega ')\) and \(H'' = (X'',\mu '',Y'',\nu '',\omega '')\) be measure hypernetworks.

Pseudometric. Symmetry is obvious. For the triangle inequality, let \((\pi _{12},\xi _{12})\) and \((\pi _{23},\xi _{23})\) be couplings realizing \(d_{\mathcal {H},p}(H,H')\) and \(d_{\mathcal {H},p}(H',H'')\), respectively (see Lemma 24). Then, using the gluing lemma (see, e.g., (Sturm 2012, Lemma 1.4)), we construct a probability measure \(\pi \) on \(X \times X' \times X''\) such that the \(X \times X'\) marginal of \(\pi \) is \(\pi _{12}\), the \(X' \times X''\) marginal is \(\pi _{23}\) and the \(X \times X''\) marginal is a coupling of \(\mu \) and \(\mu ''\), which we denote \(\pi _{13}\). Likewise, we construct a probability measure \(\xi \) on \(Y \times Y' \times Y''\) with analogous properties—in particular, the \(Y \times Y''\) marginal is denoted \(\xi _{13}\). Then, using the various marginal conditions, we have

$$\begin{aligned}&d_{\mathcal {H},p}(H,H'') \\&\quad \le \textrm{dis}_p^\mathcal {H}(\pi _{13},\xi _{13}) \\&\quad =\Vert \omega -\omega ''\Vert _{L^p(\pi \otimes \xi )}\\&\quad \le \Vert \omega -\omega '\Vert _{L^p(\pi \otimes \xi )}+\Vert \omega '-\omega ''\Vert _{L^p(\pi \otimes \xi )}\\&\quad =\Vert \omega -\omega '\Vert _{L^p(\pi _{12}\otimes \xi _{12})}+\Vert \omega '-\omega ''\Vert _{L^p(\pi _{23}\otimes \xi _{23})} = d_{\mathcal {H},p}(H,H')+d_{\mathcal {H},p}(H',H''). \end{aligned}$$

Complete. We prove completeness by adapting the proof of Sturm (2012, Theorem 5.8). Let \(H_n = (X_n,\mu _n,Y_n,\nu _n,\omega _n)\) be a Cauchy sequence of measure hypernetworks. Then we can choose couplings \(\pi _n \in \mathcal {C}(\mu _{n},\mu _{n+1})\) and \(\xi _n \in \mathcal {C}(\nu _n,\nu _{n+1})\) such that \(\textrm{dis}_p^{\mathcal {H}}(\pi _n,\xi _n) \rightarrow 0\). We then apply the generalized Gluing Lemma (Sturm 2012, Lemma 1.4) to \(\pi _1,\ldots ,\pi _{N-1}\) (respectively, to \(\xi _1,\ldots ,\xi _{N-1}\)) to construct a measure \(\Pi _N\) on \(\prod _{n=1}^N X_n\) (respectively, \(\Xi _N\) on \(\prod _{n=1}^N Y_N\)). We now define \(\overline{\mu }\) to be the projective limit \(\varprojlim \Pi _N\), a well-defined probability measure on \(\prod _{n=1}^\infty X_n\) (e.g., Bochner 2020). Set \(\overline{X}:= \textrm{supp}(\overline{\mu })\); then \(\overline{X}\) is a closed subspace of a countable product of Polish spaces and is therefore Polish. Similarly, define \(\overline{\nu }:= \varprojlim \Xi _N\) and \(\overline{Y}:= \textrm{supp}(\overline{\nu })\). Next we define \(\Omega _N: \overline{X} \times \overline{Y} \rightarrow {\mathbb {R}}\) by

$$\begin{aligned} \Omega _N((x_i)_{i=1}^\infty ,(y_i)_{i=1}^\infty ) = \omega _N(x_N,y_N). \end{aligned}$$

The sequence \((\Omega _N)_{N=1}^\infty \) is a Cauchy sequence in the Banach space \(L^p(\overline{X} \times \overline{Y}, \overline{\mu } \otimes \overline{\nu })\), since

$$\begin{aligned} \Vert \Omega _N - \Omega _{N+1}\Vert _{L^p(\overline{\mu } \otimes \overline{\nu })} = \textrm{dis}_p^{\mathcal {H}}(\pi _N,\xi _N) \rightarrow 0, \end{aligned}$$

and we define \(\overline{\omega }\) to be its limit. Each \(\omega _N\) is essentially bounded, so it follows that \(\overline{\omega }\) is as well. Setting \(\overline{H}:= (\overline{X},\overline{\mu },\overline{Y},\overline{\nu },\overline{\omega })\), we have \(\overline{H} \in \mathcal {H}\) and

$$\begin{aligned} d_{\mathcal {H},p}(H_n,\overline{H}) \le \Vert \omega _n - \overline{\omega }\Vert _{L^p(\overline{\mu } \otimes \overline{\nu })} \rightarrow 0. \end{aligned}$$

Geodesic. Finally, we show that the metric on \(\mathcal {H}\) modulo weak isomorphism is geodesic. For \(H \in \mathcal {H}\), let [H] denote its weak isomorphism class. We will construct an explicit geodesic between any two weak isomorphism classes of hypernetworks [H] and \([H']\), following the main idea of the proof of Sturm (2012, Theorem 3.1). Let \(\pi \in \mathcal {C}(\mu ,\mu ')\) and \(\xi \in \mathcal {C}(\nu ,\nu ')\) be optimal couplings (which exist, by Lemma 24). Consider the path \(\gamma :[0,1] \rightarrow \mathcal {H}\) defined by

$$\begin{aligned} \gamma (t) = \left( \textrm{supp}(\pi ), \pi , \textrm{supp}(\xi ), \xi , \omega _t\right) , \end{aligned}$$
(32)

where

$$\begin{aligned} \omega _t((x,x'),(y,y')) = (1-t)\omega (x,y) + t \omega '(x',y'). \end{aligned}$$

Observe that \(\gamma (0) \in [H]\) and \(\gamma (1) \in [H']\); indeed, coordinate projection maps \(\phi :X \times X' \rightarrow X\) and \(\psi :Y \times Y' \rightarrow Y\) define a basic weak isomorphism \(\gamma (0) \rightarrow H\) and a similar construction works for \(\gamma (1) \rightarrow H'\). To show that \(\gamma \) defines a geodesic in \([\mathcal {H}]\), it suffices to show that for all \(0 \le s \le t \le 1\),

$$\begin{aligned} d_{\mathcal {H},p}(\gamma (s),\gamma (t)) \le (t-s) d_{\mathcal {H},p}(H,H') \end{aligned}$$
(33)

(see, e.g., (Chowdhury and Mémoli 2018, Lemma 1.3)). One can show by an elementary computation that

$$\begin{aligned} \textrm{dis}_{\gamma (s),\gamma (t),p}^{\mathcal {H}}(\pi ,\xi )^p \le (t-s)^p \textrm{dis}_{H,H',p}^{\mathcal {H}}(\pi ,\xi )^p \end{aligned}$$

for any pair of couplings \((\pi ,\xi )\)—indeed, this holds because linear interpolations are geodesics in \(L^p\) spaces—and (33) follows. \(\square \)

Remark 12

The proof of the triangle inequality above uses the notion of gluing, which appears in our category theoretic framework in Sect. 3. The coupling \(\pi _{13}\) constructed above is written as \(\pi _{13} = \pi _{23} \bullet \pi _{13}\) in the notation from Sect. 3. A byproduct of our proof is the following useful fact: for measure networks \(N,N',N''\) and couplings \(\pi \in \mathcal {C}(\mu ,\mu ')\) and \(\pi ' \in \mathcal {C}(\mu ',\mu '')\), we have

$$\begin{aligned} \textrm{dis}_p^\mathcal {N}(\pi ' \bullet \pi ) \le \textrm{dis}_p^\mathcal {N}(\pi ') + \textrm{dis}_p^\mathcal {N}(\pi ). \end{aligned}$$

A similar statement holds in the measure hypernetwork setting.

1.2 Proofs of technical Lemmas from Sect. 3.6

Proof of Lemma 12

We will prove the statement for hypernetworks, as the network case is similar. Let \(\pi \) and \(\xi \) be couplings realizing \(d_{\mathcal {H},p}(H,H')\) (such couplings always exist, see Lemma 24 in the appendix). Define

$$\begin{aligned} \overline{X} = \textrm{supp}(\pi ) \subset X \times X', \quad \overline{\mu } = \pi \mid _{\overline{X}}, \quad \overline{Y} = \textrm{supp}(\xi ) \subset Y \times Y', \quad \overline{\nu } = \xi \mid _{\overline{Y}} \end{aligned}$$

and define hypernetwork functions \(\overline{\omega },\overline{\omega }':\overline{X} \times \overline{Y} \rightarrow {\mathbb {R}}\) as follows. Let \(p_Z:X \times X' \rightarrow Z\) be coordinate projection for \(Z \in \{X,X'\}\) and let \(p_Z:Y \times Y' \rightarrow Z\) be coordinate projection for \(Z \in \{Y,Y'\}\). The hypernetwork functions are then given by

$$\begin{aligned} \overline{\omega } = \omega \circ (p_X \times p_Y) \qquad \text{ and } \qquad \overline{\omega }' = \omega ' \circ (p_{X'} \times p_{Y'}). \end{aligned}$$

The coordinate projections are measure-preserving, due to the marginal constraints of \(\pi \) and \(\xi \), and the pair \((p_X,p_Y)\) (respectively, \((p_{X'},p_{Y'})\)) therefore defines a basic weak isomorphism (Definition 6) from \(\overline{H}\) to H (respectively, \(\overline{H'}\) to \(H'\)).

To see (12), observe first that \(d_{\mathcal {H},p}(H,H') = d_{\mathcal {H},p}(\overline{H},\overline{H'})\) follows by weak isomorphism. To see the other equality for \(p \in [1,\infty )\), we have

$$\begin{aligned} d_{\mathcal {H},p}(H,H')^p&= \int _{Y \times Y'} \int _{X \times X'} |\omega (x,y) - \omega '(x',y')|^p \; \pi (dx \times dx') \xi (dy \times dy') \\&= \int _{\overline{Y}} \int _{\overline{X}} |\omega (x,y) - \omega '(x',y')|^p \; \overline{\mu }(dx \times dx') \overline{\nu }(dy \times dy'), \\ \end{aligned}$$

since the value of the integral isn’t changed by replacing the full spaces with supports of the measures. Using the change of variables \(\overline{x} = (x,x') \in \overline{X}\) and \(\overline{y} = (y,y') \in \overline{Y}\), the above is equal to

$$\begin{aligned} \int _{\overline{Y}} \int _{\overline{X}} |\overline{\omega }(\overline{x},\overline{y}) - \overline{\omega }'(\overline{x},\overline{y})|^p \; \overline{\mu }(d\overline{x}) \overline{\nu }(d\overline{y}) = \Vert \overline{\omega } - \overline{\omega }'\Vert _{L^p(\overline{\mu } \otimes \overline{\nu })}^p. \end{aligned}$$

For the \(p = \infty \) case, a similar change of variables gives

$$\begin{aligned} d_{\mathcal {H},\infty }(H,H')&= \inf \{c \mid |\omega (x,y) - \omega '(x',y')|\le c \text{ for } \pi \otimes \xi -\text{ almost } \text{ every } (x,y,x',y')\} \\&= \inf \{c \mid |\overline{\omega }(\overline{x},\overline{y}) - \overline{\omega }'(\overline{x},\overline{y})|\\&\le c \text{ for } \overline{\mu } \otimes \overline{\xi }-\text{ almost } \text{ every } (\overline{x},\overline{y})\} \\&= \Vert \overline{\omega } - \overline{\omega }'\Vert _{L^\infty (\overline{\mu } \times \overline{\nu })}. \end{aligned}$$

\(\square \)

Proof of Lemma 13

By Proposition 2, it suffices to show that if there is a basic weak isomorphism \((\phi ,\psi )\) from H to \(H'\), then \(\phi \) defines a basic weak isomorphism of \(N:= \textsf{Q}_q(H) = (X,\mu ,\omega _{\textsf{Q}_q})\) and \(N':= \textsf{Q}_q(H') = (X',\mu ',\omega _{\textsf{Q}_q}')\). By an argument similar to the one used to prove functoriality in Theorem 11, the assumption that \((\phi ,\psi )\) is a basic weak isomorphism implies that for \(\mu \otimes \mu \otimes \nu \)-almost every \((x_1,x_2,y) \in X \times X \times Y\), \(\min _j \omega (x_j,y) = \min _j \omega '(\phi (x_j),\psi (y))\). In the \(q < \infty \) case, we deduce that for \(\mu \otimes \mu \) almost every \((x_1,x_2)\),

$$\begin{aligned} \omega _{\textsf{Q}_q}'(\phi (x_1),\phi (x_2))^p&= \int _{Y'} \min _j \omega '(\phi (x_j),y')^p \; \nu ' (dy') \nonumber \\&= \int _{Y} \min _j \omega '(\phi (x_j),\psi (y))^p \; \nu (dy)\nonumber \\&= \int _{Y} \min _j \omega (x_j,y)^p \; \nu (dy) \nonumber \\&= \omega _{\textsf{Q}_q}(x_1,x_2)^p, \end{aligned}$$
(34)

where (34) follows by the change-of-variables formula (using that \(\psi \) is measure-preserving). The \(q=\infty \) case follows by a limiting argument. \(\square \)

Proof of Lemma 14

This is Mémoli (2011a, Theorem 5.1 (d)), in the metric measure space setting. The proof goes through in the measure network setting. \(\square \)

1.3 Runtime analysis for hypergraph simplification

We ran the experiments using the following computing infrastructures:

  • CPU models: 32 Intel Xeon Silver 4108 CPU 1.80 GHz cores (HT)

  • Amount of memory: 132 GB of RAM

  • Operating system: OpenSUSE Leap 15.3 (x86_64)

Table 1 summarizes the size of each dataset, including the number of hyperedges and the number of vertices contained in each hypergraph, the runtime of computing the barcode, the runtime of performing the simplification (simp.) at the first step, and the maximum runtime among all simplification steps for each dataset. Figure 12 demonstrates the distribution and the minimum runtime of computing the hypernetwork distances \(d_{\mathcal {H},2}(H_0, H_i)\) at each step i. Table 2 summarizes the minimum runtime at the first step as well as the minimum, maximum, mean, and median values of the minimum runtime of computing the distances across all simplification steps for each dataset.

The largest dataset is the Coauthor Network dataset, which contains 506 nodes and 491 hyperedges. Therefore, this dataset gives rise to the largest runtime of computing the barcode, simplified hypergraphs, and hypernetwork distances. The maximum runtime among all steps is around 5 s for the Coauthor Network dataset.

Table 1 The number of vertices (#v), the number of hyperedges (#e), the runtime (in seconds) of computing the barcode, the runtime (in seconds) of performing the simplification at the first step, and the maximum runtime (in seconds) among all simplification steps in datasets ActiveDNS (DNS), Coauthor Network (CN), and Les Misérables (LM)
Fig. 12
figure 12

Runtime in computing the GW distances for activeDNS (a), Les Misérables (b) and Coauthor Network (c) datasets. The x-axes display the simplification steps. The y-axes are time recorded in seconds

Table 2 The minimum runtime (in seconds) at the first step as well as the minimum, maximum, mean, and median values of the minimum runtime (in seconds) of computing the distances across all simplification steps for each dataset

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Chowdhury, S., Needham, T., Semrad, E. et al. Hypergraph co-optimal transport: metric and categorical properties. J Appl. and Comput. Topology 8, 1171–1230 (2024). https://doi.org/10.1007/s41468-023-00142-9

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s41468-023-00142-9

Keywords

Mathematics Subject Classification

Navigation