From Geometry to Topology: Inverse Theorems for Distributed Persistence

Abstract

What is the "right" topological invariant of a large point cloud X? Prior research has focused on estimating the full persistence diagram of X, a quantity that is very expensive to compute, unstable to outliers, and far from injective. We therefore propose that, in many cases, the collection of persistence diagrams of many small subsets of X is a better invariant. This invariant, which we call "distributed persistence," is perfectly parallelizable, more stable to outliers, and has a rich inverse theory. The map from the space of metric spaces (with the quasi-isometry distance) to the space of distributed persistence invariants (with the Hausdorff-Bottleneck distance) is globally bi-Lipschitz. This is a much stronger property than simply being injective, as it implies that the inverse image of a small neighborhood is a small neighborhood, and is to our knowledge the only result of its kind in the topological data analysis (TDA) literature. Moreover, the inverse Lipschitz constant depends on the size of the subsets taken, so that as the size of these subsets goes from small to large, the invariant interpolates between a purely geometric one and a topological one. Lastly, we note that our inverse results do not actually require considering all subsets of a fixed size (an enormous collection), but only a relatively small collection satisfying simple covering properties. These theoretical results are complemented by synthetic experiments demonstrating the use of distributed persistence in practice.
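
To make the idea of "persistence diagrams of many small subsets" concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: it computes only degree-0 Vietoris-Rips persistence (where every class is born at 0 and the finite death times are the edge lengths of a minimum spanning tree), it draws subsets uniformly at random rather than choosing a collection with the covering properties the abstract mentions, and the function names `h0_deaths` and `distributed_h0` are hypothetical.

```python
import itertools
import math
import random


def h0_deaths(points):
    """Finite death times of the degree-0 Vietoris-Rips diagram of `points`.

    Every class is born at 0; the finite deaths are the edge lengths of a
    minimum spanning tree, found here with Kruskal's algorithm.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components: one class dies
            parent[ri] = rj
            deaths.append(length)
    return deaths  # n - 1 values in ascending order


def distributed_h0(points, k, num_subsets, seed=0):
    """Degree-0 diagrams of `num_subsets` random k-point subsets.

    Assumption: uniform random sampling stands in for whatever subset
    collection one would actually choose. Each subset is independent of
    the others, so this loop could be run in parallel.
    """
    rng = random.Random(seed)
    result = []
    for _ in range(num_subsets):
        idx = sorted(rng.sample(range(len(points)), k))
        result.append((tuple(idx), h0_deaths([points[i] for i in idx])))
    return result


if __name__ == "__main__":
    square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for idx, deaths in distributed_h0(square, k=3, num_subsets=4):
        print(idx, deaths)
```

In this degree-0 setting, a subset of size k = 2 has a single finite death time equal to the distance between its two points, so the collection of diagrams over all pairs records exactly the pairwise distances. This is one simple way to see the "purely geometric" end of the interpolation described in the abstract.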


Authors: Elchanan Solomon, Alexander Wagner, Paul Bendich
