
Sparser Johnson-Lindenstrauss Transforms

Published: 01 January 2014

Abstract

We give two different and simple constructions for dimensionality reduction in ℓ2 via linear mappings that are sparse: only an O(ε)-fraction of entries in each column of our embedding matrices are non-zero to achieve distortion 1 + ε with high probability, while still achieving the asymptotically optimal number of rows. These are the first constructions to provide subconstant sparsity for all values of parameters, improving upon previous works of Achlioptas [2003] and Dasgupta et al. [2010]. Such distributions can be used to speed up applications where ℓ2 dimensionality reduction is used.
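The abstract's sparse-matrix description can be made concrete with a small sketch. Below is a minimal NumPy illustration of one standard sparse-JL construction of this kind (the "block" variant: the m rows are split into s blocks, and each column gets exactly one ±1/√s entry per block, so the nonzero fraction per column is s/m). The function name, parameter choices, and demo sizes are illustrative assumptions, not code or constants from the paper.

```python
import numpy as np

def sparse_jl_matrix(m, d, s, seed=None):
    """Sample an m x d sparse JL embedding matrix via the block
    construction: split the m rows into s blocks of m/s rows and
    give each column one +-1/sqrt(s) entry per block."""
    rng = np.random.default_rng(seed)
    assert m % s == 0, "block construction needs s to divide m"
    block = m // s
    S = np.zeros((m, d))
    for j in range(d):
        # One uniformly random row inside each of the s blocks.
        rows = np.arange(s) * block + rng.integers(0, block, size=s)
        signs = rng.choice([-1.0, 1.0], size=s)
        S[rows, j] = signs / np.sqrt(s)
    return S

# Demo: the embedding roughly preserves squared l2 norms.
d, m, s = 10_000, 512, 8   # illustrative sizes; s/m is the nonzero fraction
x = np.random.default_rng(0).standard_normal(d)
S = sparse_jl_matrix(m, d, s, seed=1)
ratio = float(np.linalg.norm(S @ x) ** 2 / np.linalg.norm(x) ** 2)
```

Because each column has only s nonzeros, applying S costs s operations per nonzero coordinate of x rather than m, which is the source of the speedup for sparse inputs.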

References

[1] Dimitris Achlioptas. 2003. Database-friendly random projections: Johnson-Lindenstrauss with binary coins. J. Comput. Syst. Sci. 66, 4, 671--687.
[2] Nir Ailon and Bernard Chazelle. 2009. The fast Johnson-Lindenstrauss transform and approximate nearest neighbors. SIAM J. Comput. 39, 1, 302--322.
[3] Nir Ailon and Edo Liberty. 2009. Fast dimension reduction using Rademacher series on dual BCH codes. Disc. Comput. Geom. 42, 4, 615--630.
[4] Nir Ailon and Edo Liberty. 2013. An almost optimal unrestricted fast Johnson-Lindenstrauss transform. ACM Trans. Algor. 9, 3, 21.
[5] Noga Alon. 2003. Problems and results in extremal combinatorics I. Discrete Math. 273, 1--3, 31--53.
[6] Rosa I. Arriaga and Santosh Vempala. 2006. An algorithmic theory of learning: Robust concepts and random projection. Mach. Learn. 63, 2, 161--182.
[7] Vladimir Braverman, Rafail Ostrovsky, and Yuval Rabani. 2010. Rademacher chaos, random Eulerian graphs and the sparse Johnson-Lindenstrauss transform. CoRR abs/1011.2590.
[8] J. Lawrence Carter and Mark N. Wegman. 1979. Universal classes of hash functions. J. Comput. Syst. Sci. 18, 2, 143--154.
[9] Moses Charikar, Kevin Chen, and Martin Farach-Colton. 2004. Finding frequent items in data streams. Theoret. Comput. Sci. 312, 1, 3--15.
[10] Kenneth L. Clarkson and David P. Woodruff. 2009. Numerical linear algebra in the streaming model. In Proceedings of the 41st ACM Symposium on Theory of Computing (STOC). 205--214.
[11] Kenneth L. Clarkson and David P. Woodruff. 2013. Low rank approximation and regression in input sparsity time. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC). 81--90.
[12] Anirban Dasgupta, Ravi Kumar, and Tamás Sarlós. 2010. A sparse Johnson-Lindenstrauss transform. In Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC). 341--350.
[13] Sanjoy Dasgupta and Anupam Gupta. 2003. An elementary proof of a theorem of Johnson and Lindenstrauss. Random Struct. Algor. 22, 1, 60--65.
[14] Peter Frankl and Hiroshi Maehara. 1988. The Johnson-Lindenstrauss lemma and the sphericity of some graphs. J. Comb. Theory Ser. B 44, 3, 355--362.
[15] Zoltán Füredi and János Komlós. 1981. The eigenvalues of random symmetric matrices. Combinatorica 1, 3, 233--241.
[16] David Lee Hanson and Farroll Tim Wright. 1971. A bound on tail probabilities for quadratic forms in independent random variables. Ann. Math. Statist. 42, 3, 1079--1083.
[17] Sariel Har-Peled, Piotr Indyk, and Rajeev Motwani. 2012. Approximate nearest neighbor: Towards removing the curse of dimensionality. Theory Comput. 8, 1, 321--350.
[18] Aicke Hinrichs and Jan Vybíral. 2011. Johnson-Lindenstrauss lemma for circulant matrices. Random Struct. Algor. 39, 3, 391--398.
[19] Piotr Indyk. 2001. Algorithmic applications of low-distortion geometric embeddings. In Proceedings of the 42nd Annual Symposium on Foundations of Computer Science (FOCS). 10--33.
[20] T. S. Jayram and David P. Woodruff. 2013. Optimal bounds for Johnson-Lindenstrauss transforms and streaming problems with subconstant error. ACM Trans. Algor. 9, 3, 26.
[21] William B. Johnson and Joram Lindenstrauss. 1984. Extensions of Lipschitz mappings into a Hilbert space. Contemp. Math. 26, 189--206.
[22] Daniel M. Kane, Raghu Meka, and Jelani Nelson. 2011a. Almost optimal explicit Johnson-Lindenstrauss transformations. In Proceedings of the 15th International Workshop on Randomization and Computation (RANDOM). 628--639.
[23] Daniel M. Kane and Jelani Nelson. 2010. A derandomized sparse Johnson-Lindenstrauss transform. CoRR abs/1006.3585.
[24] Daniel M. Kane and Jelani Nelson. 2012. Sparser Johnson-Lindenstrauss transforms. In Proceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). 1195--1206.
[25] Daniel M. Kane, Jelani Nelson, Ely Porat, and David P. Woodruff. 2011b. Fast moment estimation in data streams in optimal space. In Proceedings of the 43rd ACM Symposium on Theory of Computing (STOC). 745--754.
[26] Eyal Kaplan, Moni Naor, and Omer Reingold. 2009. Derandomized constructions of k-wise (almost) independent permutations. Algorithmica 55, 1, 113--133.
[27] Felix Krahmer and Rachel Ward. 2011. New and improved Johnson-Lindenstrauss embeddings via the Restricted Isometry Property. SIAM J. Math. Anal. 43, 3, 1269--1281.
[28] Jiří Matoušek. 2008. On variants of the Johnson-Lindenstrauss lemma. Random Struct. Algor. 33, 2, 142--156.
[29] Xiangrui Meng and Michael W. Mahoney. 2013. Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC). 91--100.
[30] Rajeev Motwani and Prabhakar Raghavan. 1995. Randomized Algorithms. Cambridge University Press.
[31] Jelani Nelson and Huy L. Nguyễn. 2013a. OSNAP: Faster numerical linear algebra algorithms via sparser subspace embeddings. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS). 117--126.
[32] Jelani Nelson and Huy L. Nguyễn. 2013b. Sparsity lower bounds for dimensionality reducing maps. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC). 101--110.
[33] Pang-Ning Tan, Michael Steinbach, and Vipin Kumar. 2005. Introduction to Data Mining. Addison-Wesley.
[34] Tamás Sarlós. 2006. Improved approximation algorithms for large matrices via random projections. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS). 143--152.
[35] Mikkel Thorup and Yin Zhang. 2012. Tabulation-based 5-independent hashing with applications to linear probing and second moment estimation. SIAM J. Comput. 41, 2, 293--331.
[36] Santosh Vempala. 2004. The Random Projection Method. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Vol. 65, American Mathematical Society.
[37] Joachim von zur Gathen and Jürgen Gerhard. 1999. Modern Computer Algebra. Cambridge University Press.
[38] Jan Vybíral. 2011. A variant of the Johnson-Lindenstrauss lemma for circulant matrices. J. Funct. Anal. 260, 4, 1096--1105.
[39] Kilian Q. Weinberger, Anirban Dasgupta, John Langford, Alexander J. Smola, and Josh Attenberg. 2009. Feature hashing for large scale multitask learning. In Proceedings of the 26th Annual International Conference on Machine Learning (ICML). 1113--1120.
[40] Eugene P. Wigner. 1955. Characteristic vectors of bordered matrices with infinite dimensions. Ann. Math. 62, 548--564.



Published In

Journal of the ACM, Volume 61, Issue 1 (January 2014), 222 pages
ISSN: 0004-5411
EISSN: 1557-735X
DOI: 10.1145/2578041

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Received: 01 February 2012
Revised: 01 August 2013
Accepted: 01 August 2013
Published: 01 January 2014, in JACM Volume 61, Issue 1


Author Tags

  1. Dimensionality reduction
  2. Johnson-Lindenstrauss
  3. numerical linear algebra
  4. streaming algorithms

Qualifiers

  • Research-article
  • Research
  • Refereed


Cited By

  • (2024) Fast Fusion Clustering via Double Random Projection. Entropy 26, 5 (376). DOI: 10.3390/e26050376
  • (2024) Hyperdimensional computing: A fast, robust, and interpretable paradigm for biological data. PLOS Computational Biology 20, 9 (e1012426). DOI: 10.1371/journal.pcbi.1012426
  • (2024) Efficiency of Stochastic Coordinate Proximal Gradient Methods on Nonseparable Composite Optimization. Mathematics of Operations Research. DOI: 10.1287/moor.2023.0044
  • (2024) RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search. Proceedings of the ACM on Management of Data 2, 3 (1-27). DOI: 10.1145/3654970
  • (2024) Stochastic Trust-Region Algorithm in Random Subspaces with Convergence and Expected Complexity Analyses. SIAM Journal on Optimization 34, 3 (2671-2699). DOI: 10.1137/22M1524072
  • (2024) Fast Metric Embedding into the Hamming Cube. SIAM Journal on Computing 53, 2 (315-345). DOI: 10.1137/22M1520220
  • (2024) On Design Choices in Similarity-Preserving Sparse Randomized Embeddings. 2024 International Joint Conference on Neural Networks (IJCNN) (1-8). DOI: 10.1109/IJCNN60899.2024.10651277
  • (2024) Jacobian sparsity detection using Bloom filters. Optimization Methods and Software (1-13). DOI: 10.1080/10556788.2023.2285486
  • (2024) Fast randomized numerical rank estimation for numerically low-rank matrices. Linear Algebra and its Applications 686 (1-32). DOI: 10.1016/j.laa.2024.01.001
  • (2023) Reward imputation with sketching for contextual batched bandits. Proceedings of the 37th International Conference on Neural Information Processing Systems (64577-64588). DOI: 10.5555/3666122.3668940
