Abstract
Optimal transport (OT) distances between probability distributions are parameterized by the ground metric they use between observations. Their relevance for real-life applications strongly hinges on whether that ground metric parameter is suitably chosen. The challenge of selecting it adaptively and algorithmically from prior knowledge, the so-called ground metric learning (GML) problem, has therefore appeared in various settings. In this paper, we consider the GML problem when the learned metric is constrained to be a geodesic distance on a graph that supports the measures of interest. This imposes a rich structure on candidate metrics, but also enables far more efficient learning procedures than a direct optimization over the space of all metric matrices. We use this setting to tackle an inverse problem stemming from the observation of a density evolving with time: we seek a graph ground metric such that the OT interpolation between the starting and ending densities, computed with that ground metric, agrees with the observed evolution. This dynamic OT framework is relevant for modeling natural phenomena exhibiting displacements of mass, such as the evolution of a color palette induced by changes in lighting and materials.
Acknowledgements
This work was supported by the French National Research Agency (ANR) through the ROOT research grant (ANR-16-CE23-0009). The work of G. Peyré was supported by the European Research Council (ERC project NORIA) and by the French government under the management of the Agence Nationale de la Recherche as part of the “Investissements d’avenir” program, reference ANR-19-P3IA-0001 (PRAIRIE 3IA Institute).
A Elements of proof for Proposition 1
The mapping \(\phi_1 : \mathbf{w} \in \mathbb{R}^K \mapsto {\varvec{M}} \in \mathbb{R}^{N^2}\) admits as adjoint Jacobian:
The mapping \(\phi_2 : {\varvec{M}} \in \mathbb{R}^{N^2} \mapsto {\varvec{U}}={\varvec{M}}^{-1}\in \mathbb{R}^{N^2}\) admits as adjoint Jacobian:
The mapping \(\phi_3 : {\varvec{U}} \in \mathbb{R}^{N^2} \mapsto {\varvec{V}}={\varvec{U}}^{S} \in \mathbb{R}^{N^2}\) admits as adjoint Jacobian:
The mapping \(\phi_4 : {\varvec{V}} \in \mathbb{R}^{N^2} \mapsto \mathbf{y}={\varvec{V}}\mathbf{v} \in \mathbb{R}^{N}\) admits as adjoint Jacobian:
Since \(\Phi = \phi_4 \circ \phi_3 \circ \phi_2 \circ \phi_1\), we compose the adjoint Jacobians in the reverse order as follows:
to finally obtain:
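To make the reverse-order composition concrete, the following NumPy sketch pulls a gradient back through the chain \({\varvec{M}} \mapsto {\varvec{U}}={\varvec{M}}^{-1} \mapsto {\varvec{V}}={\varvec{U}}^{S} \mapsto \mathbf{y}={\varvec{V}}\mathbf{v}\) using the standard adjoints of matrix inversion, matrix power, and matrix–vector multiplication, and checks the result against finite differences. It is an illustration only, not the authors' implementation: the map \(\phi_1\) from graph weights \(\mathbf{w}\) to \({\varvec{M}}\) is specific to the paper's graph construction and is omitted, and all function and variable names below are ours.

```python
import numpy as np

# Forward chain of Proposition 1, starting at M (phi_1 omitted):
# U = M^{-1}  (phi_2),  V = U^S  (phi_3),  y = V v  (phi_4).
def forward(M, v, S):
    U = np.linalg.inv(M)
    V = np.linalg.matrix_power(U, S)
    return U, V, V @ v

def pullback(M, v, S, g_y):
    """Apply the adjoint Jacobians of phi_4, phi_3, phi_2 in reverse order
    to pull the upstream gradient g_y = dL/dy back to dL/dM."""
    U = np.linalg.inv(M)
    # Adjoint of phi_4 (y = V v): dL/dV = g_y v^T.
    g_V = np.outer(g_y, v)
    # Adjoint of phi_3 (V = U^S): dL/dU = sum_s (U^T)^s g_V (U^T)^(S-1-s).
    Ut = U.T
    g_U = sum(np.linalg.matrix_power(Ut, s) @ g_V @ np.linalg.matrix_power(Ut, S - 1 - s)
              for s in range(S))
    # Adjoint of phi_2 (U = M^{-1}): dL/dM = -U^T g_U U^T.
    return -Ut @ g_U @ Ut

# Finite-difference check on a small random example.
rng = np.random.default_rng(0)
N, S = 4, 3
M = np.eye(N) + 0.1 * rng.random((N, N))
v = rng.random(N)
g_y = rng.random(N)                 # arbitrary upstream gradient dL/dy
g_M = pullback(M, v, S, g_y)

eps, (i, j) = 1e-6, (1, 2)
E = np.zeros((N, N)); E[i, j] = eps
num = (forward(M + E, v, S)[2] - forward(M - E, v, S)[2]) @ g_y / (2 * eps)
print(np.allclose(g_M[i, j], num, atol=1e-5))   # True: adjoint matches finite differences
```

The same reverse-order accumulation is what an automatic-differentiation framework performs internally; the sketch simply spells out the three adjoint applications explicitly.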