An efficient kernel matrix evaluation measure

Published: 01 November 2008

Abstract

We study the problem of evaluating the goodness of a kernel matrix for a classification task. Because kernel matrix evaluation is usually embedded in other expensive procedures such as feature and model selection, the goodness measure must be computed efficiently. Most previous approaches are not efficient; the exception is kernel target alignment (KTA), which can be calculated in O(n^2) time. Although KTA is widely used, we show that it has some serious drawbacks. We propose an efficient surrogate measure that evaluates the goodness of a kernel matrix based on the distributions of the classes in the feature space. The measure not only overcomes the limitations of KTA but also possesses other desirable properties, such as invariance, efficiency, and an error-bound guarantee. Comparative experiments show that the measure is a good indicator of the goodness of a kernel matrix.
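
To make the contrast concrete, here is a minimal sketch, assuming NumPy. The first function is KTA in its standard form (alignment between K and the target matrix yy^T); the second is a simple distribution-based class-separability score of the general kind the abstract describes, offered only as an illustration of the idea, not as the paper's actual measure, which the abstract does not spell out. Function names and the toy data are hypothetical. Both run in O(n^2) time for an n x n kernel matrix.

```python
import numpy as np

def kernel_target_alignment(K, y):
    """Standard KTA: A(K, yy^T) = <K, yy^T>_F / (n * ||K||_F)
    for labels y in {-1, +1}. O(n^2), as noted in the abstract."""
    y = np.asarray(y, dtype=float)
    # <K, yy^T>_F = y^T K y, computed without materializing yy^T.
    return (y @ K @ y) / (y.size * np.linalg.norm(K, "fro"))

def class_separability(K, y):
    """Illustrative distribution-based score (NOT the paper's measure):
    squared distance between the two class means in feature space,
    divided by the summed within-class variances; every term is
    computable from kernel entries alone."""
    pos, neg = np.where(y == 1)[0], np.where(y == -1)[0]
    m_pp = K[np.ix_(pos, pos)].mean()   # <m_+, m_+>
    m_nn = K[np.ix_(neg, neg)].mean()   # <m_-, m_->
    m_pn = K[np.ix_(pos, neg)].mean()   # <m_+, m_->
    dist2 = m_pp + m_nn - 2.0 * m_pn    # ||m_+ - m_-||^2
    var_p = K[pos, pos].mean() - m_pp   # E ||phi(x) - m_+||^2
    var_n = K[neg, neg].mean() - m_nn
    return dist2 / (var_p + var_n)

# Toy check: an RBF kernel on two well-separated Gaussian blobs
# should score high on both measures.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq)
print(kernel_target_alignment(K, y), class_separability(K, y))
```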

Published In

Pattern Recognition, Volume 41, Issue 11
November 2008
249 pages

Publisher

Elsevier Science Inc.

United States


Author Tags

  1. Class separability measure
  2. Classification
  3. Kernel matrix quality measure
  4. Kernel methods
  5. Kernel target alignment

Cited By

  • (2022) An element-wise kernel learning framework. Applied Intelligence 53(8), 9531-9547. doi:10.1007/s10489-022-04020-2. Online publication date: 9-Aug-2022.
  • (2018) Automatic Classifier Selection Based on Classification Complexity. Pattern Recognition and Computer Vision, 292-303. doi:10.1007/978-3-030-03338-5_25. Online publication date: 23-Nov-2018.
  • (2017) Efficient kernel selection via spectral analysis. Proceedings of the 26th International Joint Conference on Artificial Intelligence, 2124-2130. doi:10.5555/3172077.3172183. Online publication date: 19-Aug-2017.
  • (2016) Kernel latent features adaptive extraction and selection method for multi-component non-stationary signal of industrial mechanical device. Neurocomputing 216(C), 296-309. doi:10.1016/j.neucom.2016.07.043. Online publication date: 5-Dec-2016.
  • (2015) An efficient Gaussian kernel optimization based on centered kernel polarization criterion. Information Sciences: An International Journal 322(C), 133-149. doi:10.1016/j.ins.2015.06.010. Online publication date: 20-Nov-2015.
  • (2015) Optimizing kernel methods to reduce dimensionality in fault diagnosis of industrial systems. Computers and Industrial Engineering 87(C), 140-149. doi:10.1016/j.cie.2015.05.012. Online publication date: 1-Sep-2015.
  • (2015) An overview of kernel alignment and its applications. Artificial Intelligence Review 43(2), 179-192. doi:10.1007/s10462-012-9369-4. Online publication date: 1-Feb-2015.
  • (2014) Two methods of selecting Gaussian kernel parameters for one-class SVM and their application to fault detection. Knowledge-Based Systems 59, 75-84. doi:10.1016/j.knosys.2014.01.020. Online publication date: 1-Mar-2014.
  • (2013) Eigenvalues perturbation of integral operator for kernel selection. Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, 2189-2198. doi:10.1145/2505515.2505584. Online publication date: 27-Oct-2013.
  • (2012) A survey of the state of the art in learning the kernels. Knowledge and Information Systems 31(2), 193-221. doi:10.5555/3225628.3225722. Online publication date: 1-May-2012.