Abstract
The Nyström method and low-rank linearized Support Vector Machines (SVMs) are two widely used methods for scaling up kernel SVMs, both of which need to sample a subset of the columns of the kernel matrix to reduce its size. However, existing non-uniform sampling methods suffer from at least quadratic time complexity in the number of training examples, limiting the scalability of kernel SVMs. In this paper, we propose a parallel sampling method called parallel column subset selection (PCSS) based on the divide-and-conquer strategy, which divides the kernel matrix into several small submatrices and then selects columns in parallel. We prove that PCSS has a \((1+\epsilon)\) relative-error upper bound with respect to the kernel matrix. Further, we present two approaches to scaling up kernel SVMs by combining PCSS with the Nyström method and low-rank linearized SVMs. The results of comparison experiments demonstrate the effectiveness, efficiency and scalability of our approaches.
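For intuition, the following is a minimal sketch of a divide-and-conquer column selection combined with the Nyström approximation \(\mathbf{K} \approx \mathbf{C}\mathbf{W}^{+}\mathbf{C}^{\top}\). Only the overall structure (partition the columns into blocks, select in parallel, recombine) follows the description above; the RBF kernel, the per-block selection rule (squared column-norm sampling), and all function names are illustrative assumptions, not the paper's actual PCSS criterion or error analysis.

```python
# A minimal sketch, NOT the paper's PCSS algorithm: the per-block selection
# rule (column-norm sampling) and the RBF kernel are illustrative assumptions.
import numpy as np
from concurrent.futures import ProcessPoolExecutor


def rbf_kernel_columns(X, idx, gamma=1.0):
    """Compute the kernel columns K[:, idx] without forming the full n x n matrix."""
    d2 = ((X[:, None, :] - X[None, idx, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)


def select_from_block(args):
    """Select l_t columns from one block of column indices
    (illustrative rule: sample in proportion to squared column norms)."""
    X, block_idx, l_t, gamma, seed = args
    rng = np.random.default_rng(seed)
    C_block = rbf_kernel_columns(X, block_idx, gamma)
    p = (C_block ** 2).sum(axis=0)
    p /= p.sum()
    chosen = rng.choice(len(block_idx), size=l_t, replace=False, p=p)
    return [block_idx[j] for j in chosen]


def pcss_nystrom(X, t=4, l=64, gamma=1.0):
    """Partition the n kernel columns into t blocks, pick l/t columns per block
    in parallel, and return the Nystrom factors of K ~= C @ W_pinv @ C.T."""
    n = X.shape[0]
    blocks = np.array_split(np.arange(n), t)        # floor/ceil block sizes (see Note 1)
    jobs = [(X, list(b), l // t, gamma, s) for s, b in enumerate(blocks)]
    with ProcessPoolExecutor() as ex:               # blocks processed in parallel
        selected = [i for part in ex.map(select_from_block, jobs) for i in part]
    C = rbf_kernel_columns(X, selected, gamma)      # n x l sampled columns
    W = C[selected, :]                              # l x l intersection block
    return C, np.linalg.pinv(W)


if __name__ == "__main__":
    X = np.random.randn(1000, 10)
    C, W_pinv = pcss_nystrom(X, t=4, l=64, gamma=0.1)
    K_approx = C @ W_pinv @ C.T                     # low-rank surrogate of the kernel matrix
    print(K_approx.shape)
```

The low-rank factors \(\mathbf{C}\) and \(\mathbf{W}^{+}\) can then be fed to a Nyström-based kernel machine or used to linearize the SVM, as discussed in the paper.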
Notes
- 1.
We assume \(n \bmod t = 0\) for simplicity, so that each submatrix contains \(n/t\) columns. In the general case we can still partition \(\mathbf{K}\) into t submatrices, each containing \(\lfloor n/t \rfloor\) or \(\lceil n/t \rceil\) columns.
- 2.
Here we also assume \(l \bmod t = 0\) for simplicity. In the general case we can select \(\lfloor l/t \rfloor\) or \(\lceil l/t \rceil\) columns from each submatrix to ensure that the total number of selected columns is \(l\); a balanced-partition sketch is given after these notes.
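A minimal sketch of the balanced partition and per-block quota described in Notes 1 and 2, assuming nothing beyond the floor/ceiling split itself (function names are illustrative):

```python
# Partition n column indices into t blocks of size floor(n/t) or ceil(n/t),
# and assign each block a quota of floor(l/t) or ceil(l/t) selected columns
# so that the quotas sum to exactly l.
def balanced_split(n, t):
    base, extra = divmod(n, t)
    sizes = [base + 1 if i < extra else base for i in range(t)]
    blocks, start = [], 0
    for s in sizes:
        blocks.append(list(range(start, start + s)))
        start += s
    return blocks


def per_block_quota(l, t):
    base, extra = divmod(l, t)
    return [base + 1 if i < extra else base for i in range(t)]


# Example: n = 10 columns, t = 3 blocks, l = 7 columns selected in total.
blocks = balanced_split(10, 3)   # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
quotas = per_block_quota(7, 3)   # [3, 2, 2] -> sums to 7
```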
Acknowledgements
This work was supported in part by the Natural Science Foundation of China under Grant No. 61170019.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Wu, J., Feng, C., Gao, P., Liao, S. (2015). Parallel Column Subset Selection of Kernel Matrix for Scaling up Support Vector Machines. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science(), vol 9530. Springer, Cham. https://doi.org/10.1007/978-3-319-27137-8_18
DOI: https://doi.org/10.1007/978-3-319-27137-8_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27136-1
Online ISBN: 978-3-319-27137-8
eBook Packages: Computer Science (R0)