Abstract
Optimal transport (OT) distances between probability distributions are parameterized by the ground metric they use between observations. Their relevance for real-life applications strongly hinges on whether that ground metric parameter is suitably chosen. The challenge of selecting it adaptively and algorithmically from prior knowledge, the so-called ground metric learning (GML) problem, has therefore appeared in various settings. In this paper, we consider the GML problem when the learned metric is constrained to be a geodesic distance on a graph that supports the measures of interest. This imposes a rich structure on candidate metrics, but also enables far more efficient learning procedures than a direct optimization over the space of all metric matrices. We use this setting to tackle an inverse problem stemming from the observation of a density evolving with time: we seek a graph ground metric such that the OT interpolation between the starting and ending densities, computed with that ground metric, agrees with the observed evolution. This dynamic OT framework is relevant for modeling natural phenomena exhibiting displacements of mass, such as the evolution of a color palette induced by changes in lighting and materials.
Acknowledgements
This work was supported by the French National Research Agency (ANR) through the ROOT research grant (ANR-16-CE23-0009). The work of G. Peyré was supported by the European Research Council (ERC project NORIA) and by the French government under the management of the Agence Nationale de la Recherche as part of the “Investissements d’avenir” program, reference ANR-19-P3IA-0001 (PRAIRIE 3IA Institute).
A Elements of proof for Proposition 1
The mapping \(\phi_1 : \mathbf{w} \in \mathbb{R}^K \mapsto {\varvec{M}} \in \mathbb{R}^{N^2}\) admits as adjoint Jacobian:
The mapping \(\phi_2 : {\varvec{M}} \in \mathbb{R}^{N^2} \mapsto {\varvec{U}}={\varvec{M}}^{-1}\in \mathbb{R}^{N^2}\) admits as adjoint Jacobian:
The mapping \(\phi_3 : {\varvec{U}} \in \mathbb{R}^{N^2} \mapsto {\varvec{V}}={\varvec{U}}^{S} \in \mathbb{R}^{N^2}\) admits as adjoint Jacobian:
The mapping \(\phi_4 : {\varvec{V}} \in \mathbb{R}^{N^2} \mapsto \mathbf{y}={\varvec{V}}\mathbf{v} \in \mathbb{R}^{N}\) admits as adjoint Jacobian:
Since \(\Phi = \phi_4 \circ \phi_3 \circ \phi_2 \circ \phi_1\), we compose the adjoint Jacobians in the reverse order as follows:
to finally obtain:
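To make the reverse-order composition concrete, the following NumPy sketch pulls a gradient back through the chain \({\varvec{M}} \mapsto {\varvec{U}}={\varvec{M}}^{-1} \mapsto {\varvec{V}}={\varvec{U}}^{S} \mapsto \mathbf{y}={\varvec{V}}\mathbf{v}\) using the standard adjoints of matrix inversion, matrix power, and matrix–vector multiplication, and checks the result against finite differences. It is an illustration only, not the authors' implementation: the map \(\phi_1\) from graph weights \(\mathbf{w}\) to \({\varvec{M}}\) is specific to the paper's graph construction and is omitted, and all function and variable names below are ours.

```python
import numpy as np

# Forward chain of Proposition 1, starting at M (phi_1 omitted):
# U = M^{-1}  (phi_2),  V = U^S  (phi_3),  y = V v  (phi_4).
def forward(M, v, S):
    U = np.linalg.inv(M)
    V = np.linalg.matrix_power(U, S)
    return U, V, V @ v

def pullback(M, v, S, g_y):
    """Apply the adjoint Jacobians of phi_4, phi_3, phi_2 in reverse order
    to pull the upstream gradient g_y = dL/dy back to dL/dM."""
    U = np.linalg.inv(M)
    # Adjoint of phi_4 (y = V v): dL/dV = g_y v^T.
    g_V = np.outer(g_y, v)
    # Adjoint of phi_3 (V = U^S): dL/dU = sum_s (U^T)^s g_V (U^T)^(S-1-s).
    Ut = U.T
    g_U = sum(np.linalg.matrix_power(Ut, s) @ g_V @ np.linalg.matrix_power(Ut, S - 1 - s)
              for s in range(S))
    # Adjoint of phi_2 (U = M^{-1}): dL/dM = -U^T g_U U^T.
    return -Ut @ g_U @ Ut

# Finite-difference check on a small random example.
rng = np.random.default_rng(0)
N, S = 4, 3
M = np.eye(N) + 0.1 * rng.random((N, N))
v = rng.random(N)
g_y = rng.random(N)                 # arbitrary upstream gradient dL/dy
g_M = pullback(M, v, S, g_y)

eps, (i, j) = 1e-6, (1, 2)
E = np.zeros((N, N)); E[i, j] = eps
num = (forward(M + E, v, S)[2] - forward(M - E, v, S)[2]) @ g_y / (2 * eps)
print(np.allclose(g_M[i, j], num, atol=1e-5))   # True: adjoint matches finite differences
```

The same reverse-order accumulation is what an automatic-differentiation framework performs internally; the sketch simply spells out the three adjoint applications explicitly.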