DOI: 10.1007/11841036_29

Subspace sampling and relative-error matrix approximation: column-row-based methods

Published: 11 September 2006

Abstract

Much recent work in theoretical computer science, linear algebra, and machine learning has considered matrix decompositions of the following form: given an m × n matrix A, decompose it as a product of three matrices, C, U, and R, where C consists of a small number of columns of A, R consists of a small number of rows of A, and U is a small, carefully constructed matrix that guarantees that the product CUR is "close" to A. Applications of such decompositions include the computation of matrix "sketches", speeding up kernel-based statistical learning, preserving sparsity in low-rank matrix representations, and improving the interpretability of data analysis methods. Our main result is a randomized, polynomial-time algorithm which, given as input an m × n matrix A, returns as output matrices C, U, R such that ||A - CUR||_F ≤ (1+ε)||A - A_k||_F with probability at least 1-δ. Here, A_k is the "best" rank-k approximation to A (obtained by truncating the Singular Value Decomposition of A), and ||X||_F is the Frobenius norm of the matrix X. The number of columns in C and the number of rows in R is a low-degree polynomial in k, 1/ε, and log(1/δ). Our main result is obtained by extending our recent relative-error approximation algorithm for ℓ2 regression from overconstrained problems to general ℓ2 regression problems. Our algorithm is simple, and it runs in time on the order of the time needed to compute the top k right singular vectors of A. In addition, it samples the columns and rows of A via the method of "subspace sampling," so named because the sampling probabilities depend on the lengths of the rows of the top singular vectors, and because they ensure that we capture entirely a certain subspace of interest.
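
To make the construction concrete, here is a minimal sketch (in Python, assuming NumPy) of the general style of algorithm the abstract describes: columns and rows are sampled with probabilities proportional to leverage scores computed from the top-k singular vectors ("subspace sampling"), and a small intersection matrix U is then built so that CUR approximates A. The function name cur_leverage_sketch, the sample sizes c and r, and the pseudoinverse-based choice U = C⁺AR⁺ are illustrative assumptions, not the paper's exact construction or its (1+ε) guarantee.

```python
import numpy as np

def cur_leverage_sketch(A, k, c, r, seed=None):
    """Illustrative CUR decomposition via leverage-score ("subspace") sampling.

    Hypothetical example code: the sample sizes c and r, and the choice
    U = pinv(C) @ A @ pinv(R), are simplifications for exposition, not the
    paper's exact construction or error guarantee.
    """
    rng = np.random.default_rng(seed)
    U_svd, _, Vt = np.linalg.svd(A, full_matrices=False)

    # Column leverage scores: squared column norms of the top-k right
    # singular vector matrix V_k^T (they sum to k), normalized to probabilities.
    col_p = np.sum(Vt[:k, :] ** 2, axis=0) / k
    # Row leverage scores: squared row norms of U_k, normalized likewise.
    row_p = np.sum(U_svd[:, :k] ** 2, axis=1) / k

    # Subspace sampling: pick c columns and r rows (with replacement)
    # with probability proportional to their leverage scores.
    cols = rng.choice(A.shape[1], size=c, replace=True, p=col_p)
    rows = rng.choice(A.shape[0], size=r, replace=True, p=row_p)
    C, R = A[:, cols], A[rows, :]

    # One simple choice of the small "intersection" matrix: U = C^+ A R^+,
    # so that CUR is the projection of A onto the sampled column and row spaces.
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R

# Usage: compare ||A - CUR||_F against the best rank-k error ||A - A_k||_F.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 60)) @ rng.standard_normal((60, 300))
k, c, r = 10, 50, 50
C, U, R = cur_leverage_sketch(A, k, c, r, seed=1)

U_svd, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = (U_svd[:, :k] * s[:k]) @ Vt[:k, :]
print("||A - CUR||_F =", np.linalg.norm(A - C @ U @ R))
print("||A - A_k||_F =", np.linalg.norm(A - A_k))
```

The final print statements give only a rough empirical comparison of the CUR error against the best rank-k error; the paper's analysis is what turns this kind of sampling into the formal (1+ε) relative-error bound for suitably chosen sample sizes.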

Information

Published In

ESA'06: Proceedings of the 14th Annual European Symposium on Algorithms
September 2006
840 pages
ISBN: 3540388753

Publisher

Springer-Verlag

Berlin, Heidelberg

Qualifiers

  • Article

Cited By

  • (2019) Total least squares regression in input sparsity time. Proceedings of the 33rd International Conference on Neural Information Processing Systems, pp. 2482-2493. DOI: 10.5555/3454287.3454510. Online publication date: 8-Dec-2019.
  • (2019) Relative error tensor low rank approximation. Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 2772-2789. DOI: 10.5555/3310435.3310607. Online publication date: 6-Jan-2019.
  • (2018) An homotopy method for lp regression provably beyond self-concordance and in input-sparsity time. Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pp. 1130-1137. DOI: 10.1145/3188745.3188776. Online publication date: 20-Jun-2018.
  • (2017) Low rank approximation with entrywise l1-norm error. Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pp. 688-701. DOI: 10.1145/3055399.3055431. Online publication date: 19-Jun-2017.
  • (2015) Sketching for M-estimators. Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 921-939. DOI: 10.5555/2722129.2722192. Online publication date: 4-Jan-2015.
  • (2014) Sketching as a Tool for Numerical Linear Algebra. Foundations and Trends® in Theoretical Computer Science, 10(1-2):1-157. DOI: 10.1561/0400000060. Online publication date: 29-Oct-2014.
  • (2013) Low rank approximation and regression in input sparsity time. Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing, pp. 81-90. DOI: 10.1145/2488608.2488620. Online publication date: 1-Jun-2013.
  • (2011) Fast Algorithms for Approximating the Singular Value Decomposition. ACM Transactions on Knowledge Discovery from Data, 5(2):1-36. DOI: 10.1145/1921632.1921639. Online publication date: 1-Feb-2011.
  • (2008) Interpretable nonnegative matrix decompositions. Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 345-353. DOI: 10.1145/1401890.1401935. Online publication date: 24-Aug-2008.
  • (2008) Unsupervised feature selection for principal components analysis. Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 61-69. DOI: 10.1145/1401890.1401903. Online publication date: 24-Aug-2008.
