
Ground Metric Learning on Graphs


Abstract

Optimal transport (OT) distances between probability distributions are parameterized by the ground metric they use between observations. Their relevance for real-life applications strongly hinges on whether that ground metric parameter is suitably chosen. The challenge of selecting it adaptively and algorithmically from prior knowledge, the so-called ground metric learning (GML) problem, has therefore appeared in various settings. In this paper, we consider the GML problem when the learned metric is constrained to be a geodesic distance on a graph that supports the measures of interest. This constraint imposes a rich structure on candidate metrics, while also enabling far more efficient learning procedures than a direct optimization over the space of all metric matrices. We use this setting to tackle an inverse problem stemming from the observation of a density evolving with time: we seek a graph ground metric such that the OT interpolation between the starting and ending densities, computed with that ground metric, agrees with the observed evolution. This dynamic OT framework is well suited to modeling natural phenomena that exhibit displacements of mass, such as the evolution of a color palette induced by changes in lighting and materials.
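
To make the setting concrete, the following minimal sketch (an illustration, not the method of the paper) computes an entropic OT distance between two histograms supported on a small graph, using as ground metric the geodesic distance induced by a set of edge weights. In the GML problem considered here, those edge weights are precisely the parameters to be learned; the toy graph, weight values, and regularization strength below are placeholders chosen only for illustration.

```python
# Entropic OT between two histograms on a graph, with the ground metric taken as
# the geodesic (shortest-path) distance induced by edge weights w.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path

def sinkhorn(a, b, C, reg=0.1, n_iter=500):
    """Standard Sinkhorn iterations for entropic OT; returns the transport cost <P, C>."""
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]
    return np.sum(P * C)

# Toy graph: 4 nodes on a cycle, edge "lengths" w (the ground-metric parameters).
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
w = np.array([1.0, 2.0, 1.0, 0.5])
N = 4
W = np.zeros((N, N))
for (i, j), wk in zip(edges, w):
    W[i, j] = W[j, i] = wk

D = shortest_path(csr_matrix(W))        # geodesic distances = graph ground metric
a = np.array([0.7, 0.1, 0.1, 0.1])      # source histogram
b = np.array([0.1, 0.1, 0.7, 0.1])      # target histogram
print(sinkhorn(a, b, D**2))             # entropic OT cost with squared geodesic ground cost
```

Learning then amounts to adjusting the edge weights w so that OT quantities computed with the induced geodesic cost agree with observations, which is the inverse problem addressed in the paper.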



Notes

  1. https://github.com/matthieuheitz/2020-JMIV-ground-metric-learning-graphs


Acknowledgements

This work was supported by the French National Research Agency (ANR) through the ROOT research grant (ANR-16-CE23-0009). The work of G. Peyré was supported by the European Research Council (ERC project NORIA) and by the French government, under the management of the Agence Nationale de la Recherche as part of the “Investissements d’avenir” program, reference ANR-19-P3IA-0001 (PRAIRIE 3IA Institute).

Author information


Correspondence to Matthieu Heitz or David Coeurjolly.


A Elements of proof for Proposition 1


The mapping \(\phi_1 : \mathbf{w} \in \mathbb{R}^K \mapsto \mathbf{M} \in \mathbb{R}^{N^2}\) admits the adjoint Jacobian

$$\begin{aligned} \left[\partial\phi_1(\mathbf{w})\right]^T(\mathbf{X})_{i,j} = -\frac{\varepsilon}{4S}\left(\mathbf{X}_{i,j}+\mathbf{X}_{j,i}-\mathbf{X}_{i,i}-\mathbf{X}_{j,j}\right). \end{aligned}$$
(15)

The mapping \(\phi_2 : \mathbf{M} \in \mathbb{R}^{N^2} \mapsto \mathbf{U}=\mathbf{M}^{-1}\in \mathbb{R}^{N^2}\) admits the adjoint Jacobian

$$\begin{aligned} \left[\partial\phi_2(\mathbf{M})\right]^T(\mathbf{X}) = -\mathbf{M}^{-1}\mathbf{X}\mathbf{M}^{-1}. \end{aligned}$$
(16)

The mapping \(\phi_3 : \mathbf{U} \in \mathbb{R}^{N^2} \mapsto \mathbf{V}=\mathbf{U}^{S} \in \mathbb{R}^{N^2}\) admits the adjoint Jacobian

$$\begin{aligned} \left[\partial\phi_3(\mathbf{U})\right]^T(\mathbf{X}) = \sum_{l=0}^{S-1}\mathbf{U}^{l}\mathbf{X}\mathbf{U}^{S-l-1}. \end{aligned}$$
(17)

The mapping \(\phi_4 : \mathbf{V} \in \mathbb{R}^{N^2} \mapsto \mathbf{y}=\mathbf{V}\mathbf{v} \in \mathbb{R}^{N}\) admits the adjoint Jacobian

$$\begin{aligned} \left[\partial\phi_4(\mathbf{V})\right]^T(\mathbf{x}) = \mathbf{x}\mathbf{v}^T. \end{aligned}$$
(18)
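
These identities can be checked numerically. The sketch below (an illustration, not part of the paper) verifies the adjoints (16)-(18) by comparing \(\langle \mathbf{X}, \partial\phi(\cdot)[\mathrm{d}\mathbf{M}]\rangle\) against \(\langle [\partial\phi]^T(\mathbf{X}), \mathrm{d}\mathbf{M}\rangle\) with central finite differences, on a random symmetric positive definite test matrix; the symmetry of \(\mathbf{M}\) (and hence of \(\mathbf{U}\)) is what allows the transposes to be dropped in (16) and (17).

```python
# Finite-difference check of the adjoint identities (16)-(18):
# <X, dphi(M)[dM]> should equal <[dphi(M)]^T(X), dM> for any test direction dM.
import numpy as np

rng = np.random.default_rng(0)
N, S, h = 5, 4, 1e-6

A = rng.normal(size=(N, N))
M = A @ A.T + N * np.eye(N)                    # symmetric positive definite test matrix
dM = rng.normal(size=(N, N)); dM = dM + dM.T   # symmetric perturbation direction
X = rng.normal(size=(N, N))                    # incoming "gradient" matrix
x = rng.normal(size=N)                         # incoming "gradient" vector (for phi_4)
v = rng.normal(size=N)

def ddir(f, M, dM):
    """Central finite-difference directional derivative of f at M in direction dM."""
    return (f(M + h * dM) - f(M - h * dM)) / (2 * h)

def rel(a, b):
    """Relative discrepancy between two scalars."""
    return abs(a - b) / max(abs(a), abs(b), 1e-12)

# (16): phi_2(M) = M^{-1},  adjoint X -> -M^{-1} X M^{-1}
Minv = np.linalg.inv(M)
print(rel(np.sum(X * ddir(np.linalg.inv, M, dM)), np.sum((-Minv @ X @ Minv) * dM)))

# (17): phi_3(U) = U^S,  adjoint X -> sum_l U^l X U^{S-l-1}
adj17 = sum(np.linalg.matrix_power(M, l) @ X @ np.linalg.matrix_power(M, S - l - 1)
            for l in range(S))
print(rel(np.sum(X * ddir(lambda B: np.linalg.matrix_power(B, S), M, dM)),
          np.sum(adj17 * dM)))

# (18): phi_4(V) = V v,  adjoint x -> x v^T
print(rel(x @ ddir(lambda B: B @ v, M, dM), np.sum(np.outer(x, v) * dM)))

# All three printed relative errors should be negligible (well below 1e-6).
```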

Since \(\Phi = \phi_4 \circ \phi_3 \circ \phi_2 \circ \phi_1\), we compose the adjoint Jacobians in the reverse order as follows:

$$\begin{aligned} \left[\partial\Phi(\mathbf{w})\right]^T(\mathbf{g})_{i,j} = \left[\partial\phi_1\right]^T\left[\partial\phi_2\right]^T\left[\partial\phi_3\right]^T\left[\partial\phi_4\right]^T(\mathbf{g})_{i,j}, \end{aligned}$$
(19)

to finally obtain:

$$\begin{aligned} \left[\partial\Phi(\mathbf{w})\right]^T(\mathbf{g})_{i,j} = -\frac{\varepsilon}{4S}\sum_{\ell=0}^{S-1}\left(\mathbf{g}_i^\ell - \mathbf{g}_j^\ell\right)\left(\mathbf{v}_i^\ell - \mathbf{v}_j^\ell\right), \end{aligned}$$
(20)
$$\begin{aligned} \text{where}\quad \left\{\begin{array}{l} \mathbf{g}^\ell \overset{\mathrm{def.}}{=} \mathbf{M}^{\ell-S}\,\mathbf{g}, \\ \mathbf{v}^\ell \overset{\mathrm{def.}}{=} \mathbf{M}^{-\ell-1}\,\mathbf{v}. \end{array}\right. \end{aligned}$$
(21)
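
The closed form (20)-(21) can also be checked end to end against finite differences of the scalar \(\langle \mathbf{g}, \Phi(\mathbf{w})\rangle\). The sketch below assumes, consistently with the adjoint (15), a forward map of the form \(\phi_1(\mathbf{w}) = \mathrm{Id} + \frac{\varepsilon}{4S}\,L(\mathbf{w})\), where \(L(\mathbf{w})\) is the graph Laplacian built from the edge weights; this explicit form of \(\phi_1\), as well as the toy graph and parameter values, are assumptions made only for illustration.

```python
# End-to-end check of (20)-(21): compare the closed-form adjoint [dPhi(w)]^T(g)
# against finite differences of the scalar <g, Phi(w)>.
# Assumption (inferred from (15), not stated here): phi_1(w) = Id + eps/(4S) * L(w),
# with L(w) the weighted graph Laplacian.
import numpy as np

rng = np.random.default_rng(0)
N, S, eps = 6, 4, 0.5
edges = [(i, i + 1) for i in range(N - 1)]     # toy path graph, K = N - 1 edges
K = len(edges)

def laplacian(w):
    L = np.zeros((N, N))
    for k, (i, j) in enumerate(edges):
        L[i, j] -= w[k]; L[j, i] -= w[k]
        L[i, i] += w[k]; L[j, j] += w[k]
    return L

def Phi(w, v):
    M = np.eye(N) + eps / (4 * S) * laplacian(w)    # phi_1 (assumed form)
    U = np.linalg.inv(M)                            # phi_2
    V = np.linalg.matrix_power(U, S)                # phi_3
    return V @ v                                    # phi_4

def adjoint(w, v, g):
    """Closed-form [dPhi(w)]^T(g) from (20)-(21), one entry per edge (i, j)."""
    Minv = np.linalg.inv(np.eye(N) + eps / (4 * S) * laplacian(w))
    out = np.zeros(K)
    for k, (i, j) in enumerate(edges):
        acc = 0.0
        for l in range(S):
            gl = np.linalg.matrix_power(Minv, S - l) @ g   # g^l = M^{l-S} g
            vl = np.linalg.matrix_power(Minv, l + 1) @ v   # v^l = M^{-l-1} v
            acc += (gl[i] - gl[j]) * (vl[i] - vl[j])
        out[k] = -eps / (4 * S) * acc
    return out

w = rng.uniform(0.5, 2.0, size=K)
v = rng.uniform(size=N)
g = rng.normal(size=N)

# Finite differences of w -> <g, Phi(w)>, one coordinate at a time.
h, e = 1e-6, np.eye(K)
fd = np.array([(g @ Phi(w + h * e[k], v) - g @ Phi(w - h * e[k], v)) / (2 * h)
               for k in range(K)])
print(np.max(np.abs(fd - adjoint(w, v, g))))   # should be negligible (around 1e-9 or smaller)
```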


Cite this article

Heitz, M., Bonneel, N., Coeurjolly, D. et al. Ground Metric Learning on Graphs. J Math Imaging Vis 63, 89–107 (2021). https://doi.org/10.1007/s10851-020-00996-z
