Abstract
A methodology is developed for data analysis based on empirically constructed geodesic metric spaces. For a probability distribution, the length along a path between two points can be defined as the amount of probability mass accumulated along the path. The geodesic, then, is the shortest such path and defines a geodesic metric. Such metrics are transformed in a number of ways to produce parametrised families of geodesic metric spaces, empirical versions of which allow computation of intrinsic means and associated measures of dispersion. These reveal properties of the data, based on geometry, such as those that are difficult to see from the raw Euclidean distances. Examples of application include clustering and classification. For certain parameter ranges, the spaces become CAT(0) spaces and the intrinsic means are unique. In one case, a minimal spanning tree of a graph based on the data becomes CAT(0). In another, a so-called “metric cone” construction allows extension to CAT(k) spaces. It is shown how to empirically tune the parameters of the metrics, making it possible to apply them to a number of real cases.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Alexander, L., Jones, P.: Updated precipitation series for the UK and discussion of recent extremes. Atmos. Sci. Lett. 1(2), 142–150 (2000)
Amari, S.I.: Differential-Geometrical Methods in Statistics. Springer, New York (1985)
Amari, S.I., Nagaoka, H.: Methods of Information Geometry. American Mathematical Society, Providence (2007)
Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein GAN (2017). arXiv:1701.07875
Ay, N., Jost, J., Lê, H.V., Schwachhöfer, L.: Information Geometry. Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge/A Series of Modern Surveys in Mathematics. Springer, Berlin (2017)
Bache, K., Lichman, M.: UCI Machine Learning Repository (2013). http://archive.ics.uci.edu/ml/. Accessed 9 Jan 2015
Belkin, M., Niyogi, P.: Laplacian eigenmaps and spectral techniques for embedding and clustering. In: Dietterich, T.G., Becker, S., Ghahramani, Z. (eds.) Advances in Neural Information Processing Systems 14, pp. 585–591. MIT Press, Cambridge (2002)
Bengio, Y., Courville, A., Vincent, P.: Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013)
Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., Bourne, P.E.: The protein data bank. Nucleic Acids Res. 28(1), 235–242 (2000)
Bhattacharya, A., Bhattacharya, R.: Nonparametric Inference on Manifolds: With Applications to Shape Spaces. Cambridge University Press, Cambridge (2012a)
Bhattacharya, A., Bhattacharya, R.: Nonparametric inference on manifolds: with applications to shape spaces, vol. 2. Cambridge University Press, Cambridge (2012b)
Billera, L.J., Holmes, S.P., Vogtmann, K.: Geometry of the space of phylogenetic trees. Adv. Appl. Math. 27(4), 733–767 (2001)
Bridson, M.R., Haefliger, A.: Metric Spaces of Non-positive Curvature, vol. 319. Springer, Berlin (2011)
Cayton, L.: Algorithms for manifold learning. Univ. Calif. San Diego Tech. Rep. 12(1–17), 1 (2005)
CIESEN, FAO, CIAT: Gridded population of the world, version 3 (GPWV3): population count grid (2005). http://dx.doi.org/10.7927/H4639MPP. Accessed 16 June 2016
Cuturi, M., Doucet, A.: Fast computation of Wasserstein barycenters. In: International Conference on Machine Learning, jmlr.org, pp. 685–693 (2014)
De Berg, M., Van Kreveld, M., Overmars, M., Schwarzkopf, O.C.: Computational Geometry, pp. 1–17. Springer, Berlin (2000)
Deza, M.M., Deza, E.: Encyclopedia of Distances, pp. 1–583. Springer, Berlin (2009)
Dryden, I.L., Mardia, K.V.: Statistical Shape Analysis: With Applications in R. Wiley, Hoboken (2016)
Floyd, R.W.: Algorithm 97: shortest path. Commun. ACM 5(6), 345 (1962)
Gromov, M.: Hyperbolic groups. In: Gersten, S.M. (ed.) Essays in Group Theory, pp. 75–263. Springer, Berlin (1987)
Kendall, D.G.: Shape manifolds, procrustean metrics, and complex projective spaces. Bull. Lond. Math. Soc. 16(2), 81–121 (1984)
Kendall, M.G., Morán, PA.: Geometrical probability. Technical Report (1963)
Kendall, D.G., Barden, D., Carne, T.K., Le, H.: Shape and Shape Theory. Wiley, Hoboken (2009)
Komaki, F.: Shrinkage priors for Bayesian prediction. Ann. Stat. 34(2), 808–819 (2006)
Lin, Y., Lu, L., Yau, S.T.: Ricci curvature of graphs. Tohoku Math. J. 63(4), 605–627 (2011)
Marron, J.S., Alonso, A.M.: Overview of object oriented data analysis. Biom. J. 56(5), 732–753 (2014)
McCullagh, P.: Tensor Methods in Statistics. Courier Dover Publications, Mineola (1986)
Nye, T.M.W.: Principal components analysis in the space of phylogenetic trees. Ann. Stat. 39(5), 2716–2739 (2011)
Okabe, A., Boots, B., Sugihara, K., Chiu, S.N.: Spatial Tessellations: Concepts and Applications of Voronoi Diagrams, vol. 501. Wiley, Hoboken (2009)
Owen, M., Provan, J.S.: A fast algorithm for computing geodesic distances in tree space. IEEE/ACM Trans. Comput. Biol. Bioinform. 8(1), 2–13 (2011)
Panaretos, V.M., Zemel, Y.: Statistical aspects of Wasserstein distances (2018). arXiv:1806.05500
Patrangenaru, V., Ellingson, L.: Nonparametric Statistics on Manifolds and Their Applications to Object Data Analysis. CRC Press, Boca Raton (2015)
Peyré, G., Cuturi, M.: Computational optimal transport (2018). arXiv:1803.00567
Ramsay, J.O., Silverman, B.W.: Applied Functional Data Analysis: Methods and Case Studies. Springer, Berlin (2007)
Saul, L.K.: Think globally, fit locally: unsupervised learning of low dimensional manifolds. J. Mach. Learn. Res. 4, 119–155 (2003)
Solomon, J., de Goes, F., Peyré, G., Cuturi, M., Butscher, A., Nguyen, A., Du, T., Guibas, L.: Convolutional wasserstein distances: efficient optimal transportation on geometric domains. ACM Trans. Gr. 34(4), 66:1–66:11 (2015)
Srivastava, A., Klassen, E.P.: Functional and Shape Data Analysis. Springer Series in Statistics. Springer, Berlin (2016)
Takatsu, A.: Wasserstein geometry of Gaussian measures. Osaka J. Math. 48(4), 1005–1026 (2011)
Tanaka, F., Komaki, F.: A superharmonic prior for the autoregressive process of the second-order. J. Time Ser. Anal. 29(3), 444–452 (2008)
Tenenbaum, J.B., de Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)
Vallender, S.: Calculation of the Wasserstein distance between probability distributions on the line. Theory Probab. Appl. 18(4), 784–786 (1974)
Villani, C.: Optimal Transport: Old and New. Springer, Berlin (2008)
Wang, H., Marron, J.S.: Object oriented data analysis: sets of trees (2007). arXiv:0711.3147
Yang, L., Jin, R.: Distance metric learning: a comprehensive survey. Mich. State Univ. 2(2), 4 (2006)
Acknowledgements
Funding was provided by JST, PRESTO (JPMJPR14E3) and JSPS, KAKENHI (26280009,16K02843), Japan. The first author would like to thank Masayuki Sakai, Takaaki Koike and Tatsuhiro Aoshima for their excellent computation and visualization of the results. He also appreciates Reiko Miyaoka and Hiroshi Kokubu for their helpful and encouraging advice.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
JST PREST, JSPS KAKENHI Grant Numbers 16K02843 and 26280009.
Electronic supplementary material
Below is the link to the electronic supplementary material.
A Proof of Theorem 6
A Proof of Theorem 6
(1) Denote the mapped points of a, b, c and x by the projection \(\tilde{{\mathcal {X}}}_\beta \rightarrow {\mathcal {X}}_\beta \) as A, B, C and X, respectively, as shown in Fig. 14 (left). Denote the origin of the metric cone as O. If the sum of the lengths of the geodesics \({\widetilde{AB}}\), \({\widetilde{AC}}\) and \({\widetilde{BC}}\) in \({\mathcal {X}}_\beta \) exceeds \(2\pi \beta \), it is easy to see that the cone spanned by \({\overline{ab}} \cup {\overline{ac}} \cup {\overline{bc}}\) becomes CAT(0) and \(\varDelta abc\) satisfies the CAT(0) property. Therefore, assume that \(|{\widetilde{AB}}|+|{\widetilde{AC}}|+|{\widetilde{BC}}|\le 2 \pi \beta \).
Next, let \(\varDelta a'b'c'\) be a comparison triangle of \(\varDelta abc\) and let \(x'\) be a point on a geodesic \(\overline{b'c'}\) such that \(|{\overline{bx}}|=|\overline{b'x'}|\). Thus, \(|\overline{a'x'}|< |{\overline{ax}}|\). Arrange the points \(a'\), \(b'\) and \(c'\) in a three-dimensional Euclidean space with origin \(O'\) such that the lengths of \(\overline{O'a'}\), \(\overline{O'b'}\) and \(\overline{O'c'}\) are equal to the lengths of \({\overline{Oa}}\), \({\overline{Ob}}\) and \({\overline{Oc}}\), respectively. Denote the radial projection of \(a'\), \(b'\), \(c'\) and \(x'\) to a unit sphere as \(A'\), \(B'\), \(C'\) and \(X'\), respectively, as shown in Fig. 14 (right). By the definition of a metric cone, \(|{\overline{Ox}}|=|\overline{O'x'}|\) and the geodesics \(\widetilde{A'B'}\), \(\widetilde{A'C'}\), \(\widetilde{B'C'}\) and \(\widetilde{A'X'}\) in the unit sphere are arcs satisfying \(|\widetilde{A'B'}|=|{\widetilde{AB}}|\), \(|\widetilde{A'C'}|=|{\widetilde{AC}}|\), \(|\widetilde{B'C'}|=|{\widetilde{BC}}|\) and \(|\widetilde{B'X'}|=|{\widetilde{BX}}|\).
From the argument above, \(|\widetilde{A'B'}|+|\widetilde{A'C'}|+|\widetilde{B'C'}|= |{\widetilde{AB}}|+|{\widetilde{AC}}|+|{\widetilde{BC}}|\le 2 \pi \). Since the unit sphere has a positive constant curvature and \({\mathcal {X}}_\beta \) is CAT(0), \(|\widetilde{A'X'}|> |{\widetilde{AX}}|\). However, since \(|{\overline{Oa}}|=|\overline{O'a'}|\) and \(|{\overline{Ox}}|=|\overline{O'x'}|\), \(|\widetilde{A'X'}|> |{\widetilde{AX}}|\) implies that \(|\widetilde{a'x'}|> |{\widetilde{ax}}|\) by the property of a metric cone. Thus, \(\varDelta abc\) has CAT(0) property and (1) of the theorem is proved.
(2) Assume that \(0<\beta _1<\beta _2<\infty \) and a metric cone \(\tilde{{\mathcal {X}}}_{\beta _1}\) is not CAT(0) for proving the latter half of the theorem by contradiction. Then, there is a geodesic triangle \(\varDelta a_1 b_1 c_1\) in \(\tilde{{\mathcal {X}}}_{\beta _1}\) and a point \(x_1\) on the geodesic \(\overline{b_1 c_1}\) such that the geodesic \(\overline{a_1 x_1}\) is longer than the corresponding geodesic of a comparison triangle. By defining \(A_1, B_1, C_1, X_1,a'_1,b'_1,c'_1,x'_1,A'_1, B'_1, C'_1\) and \(X'_1\) as above, we can say that \(|\widetilde{A_1 X_1}|>|\widetilde{A'_1 X'_1}|\).
Next, each of \(A_1, B_1, C_1\) and \(X_1\) corresponds to a point in \({\mathcal {X}}_{\beta _1}\) and we can consider the corresponding points \(A_2, B_2, C_2\) and \(X_2\) in the other metric cone \(\tilde{{\mathcal {X}}}_{\beta _2}\). When restricted to \({\mathcal {X}}_{\beta _1}\), a geodesic \(\widetilde{A_1 X_1}\) is just a rescaling of \(\widetilde{A_2 X_2}\) and \(|\widetilde{A_1 X_1}|=\frac{\beta _1}{\beta _2}|\widetilde{A_2 X_2}|\).
Now, \(\varDelta A_2'B_2'C_2'\) is a geodesic triangle on the unit sphere, but after rescaling by \(\frac{\beta _2}{\beta _1}\), we can get a geodesic triangle \(\varDelta A_2''B_2''C_2''\) on a sphere of radius \(\frac{\beta _2}{\beta _1}\) whose edges have the same length as \(\varDelta A_1'B_1'C_1'\). By a known result on spherical triangles with the same edge lengths on different spheres, a larger radius implies a “thinner” triangle and \(|\widetilde{A_1 X_1}|<|\widetilde{A_2'' X_2''}|\) where \(X_2''\) is a point on the geodesic \(\widetilde{B_2'' C_2''}\) such that \(|\widetilde{B_1 X_1}|=|\widetilde{B_2'' X_2''}|\).
Combining all the arguments gives
Select a non-degenerate geodesic triangle in \(\tilde{{\mathcal {X}}}_{\beta _2}\) by selecting arbitrary points \(a_2,b_2\) and \(c_2\) on the geodesics \(\overline{OA_2}\), \(\overline{OB_2}\) and \(\overline{OC_2}\) in \(\tilde{{\mathcal {X}}}_{\beta _2}\), respectively, and let \(x_2\) be the intersection point of \(\overline{OX_2}\) and \(\overline{b_2 c_2}\). Then, by \(|\widetilde{A_2 X_2}|>|\widetilde{A'_2 X'_2}|\), we can say that \(|\overline{a_2 x_2}|>|\overline{a'_2 x'_2}|\). This implies that \(\tilde{{\mathcal {X}}}_{\beta _2}\) is not CAT(0) and (2) of the theorem is proved.
(3) For \(k=0\), the statement holds by (1). For \(k>0\) and \(\beta \le \pi \), it is sufficient to prove for \(\beta =\pi /\sqrt{k}\) by (2). Let \(\varDelta abc\) be a geodesic triangle in \(\tilde{{\mathcal {X}}}_\beta \) and let \(\varDelta ABC\) be a geodesic triangle in \({\mathcal {X}}_\beta \). Let A, B, C be the projection of a, b, c, respectively. If the perimeter of \(\varDelta ABC\) is longer than or equal to \(2\pi \), the cone spanned by the perimeter becomes CAT(0) by the same argument as that for (1). Therefore, \(\varDelta abc\) is CAT(0) and satisfies the CAT(0) property.
If the perimeter of \(\varDelta ABC\) is smaller than \(2\pi \), since \({\mathcal {X}}\) is CAT(k) and \({\mathcal {X}}_\beta \) is CAT(1), for any \(X\in {\widetilde{BC}}\), \({\widetilde{BX}}\) is shorter than the corresponding great arc \(\widetilde{B'X'}\) of a comparison triangle \(\varDelta A'B'C'\), which is a spherical triangle on the unit sphere. Since a comparison triangle \(\varDelta a'b'c'\) of \(\varDelta abc\) can be embedded on the cone spanned by \(\varDelta A'B'C'\), \({\widetilde{bx}}\) is shorter than the corresponding line segment \(\widetilde{b'x'}\). This means the \(\varDelta abc\) satisfies the CAT(0) property. \(\square \)
Rights and permissions
About this article
Cite this article
Kobayashi, K., Wynn, H.P. Empirical geodesic graphs and CAT(k) metrics for data analysis. Stat Comput 30, 1–18 (2020). https://doi.org/10.1007/s11222-019-09855-3
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11222-019-09855-3