Metric embedding for kernel classification rules

Published: 05 July 2008 · DOI: 10.1145/1390156.1390283

Abstract

In this paper, we consider a smoothing-kernel-based classification rule and propose an algorithm that optimizes the performance of the rule by learning the bandwidth of the smoothing kernel along with a data-dependent distance metric. The data-dependent distance metric is obtained by learning a function that embeds an arbitrary metric space into a Euclidean space while minimizing an upper bound on the resubstitution estimate of the error probability of the kernel classification rule. By restricting this embedding function to a reproducing kernel Hilbert space, we reduce the problem to solving a semidefinite program and show the resulting kernel classification rule to be a variation of the k-nearest neighbor rule. We compare the performance of the kernel rule (using the learned data-dependent distance metric) to state-of-the-art distance metric learning algorithms (designed for k-nearest neighbor classification) on several benchmark datasets. The results show that the proposed rule achieves classification accuracy as good as or better than that of the other metric learning algorithms.
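For concreteness, the following is a minimal sketch of such a smoothing-kernel classification rule, assuming a Gaussian smoothing kernel and a plain linear embedding matrix L standing in for the paper's RKHS embedding (which the paper learns by solving a semidefinite program). The function name kernel_rule_predict, its parameters, and the NumPy implementation are illustrative assumptions, not the paper's code.

    import numpy as np

    def kernel_rule_predict(X_train, y_train, X_test, L, h):
        # Embed all points via the linear map x -> L x (rows are samples);
        # a hypothetical stand-in for the learned RKHS embedding.
        Z_train = X_train @ L.T
        Z_test = X_test @ L.T
        classes = np.unique(y_train)
        preds = np.empty(len(X_test), dtype=y_train.dtype)
        for i, z in enumerate(Z_test):
            d2 = np.sum((Z_train - z) ** 2, axis=1)  # squared embedded distances
            w = np.exp(-d2 / (2.0 * h ** 2))         # Gaussian kernel weights, bandwidth h
            scores = [w[y_train == c].sum() for c in classes]  # total weight per class
            preds[i] = classes[int(np.argmax(scores))]         # largest-weight class wins
        return preds

    # Illustrative usage with an identity embedding (plain Euclidean metric).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    print(kernel_rule_predict(X, y, X[:5], L=np.eye(5), h=1.0))

Replacing the smooth Gaussian weights with a hard cutoff on the embedded distance turns the weighted vote into a neighborhood vote, which is one way to see the connection to the k-nearest neighbor rule noted in the abstract.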

Published In

ICML '08: Proceedings of the 25th International Conference on Machine Learning
July 2008 · 1310 pages
ISBN: 9781605582054
DOI: 10.1145/1390156

Sponsors

  • Pascal
  • University of Helsinki
  • Xerox
  • Federation of Finnish Learned Societies
  • Google Inc.
  • NSF
  • Machine Learning Journal/Springer
  • Microsoft Research
  • Intel
  • Yahoo!
  • Helsinki Institute for Information Technology
  • IBM

Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

  • Research-article



Acceptance Rates

Overall acceptance rate: 140 of 548 submissions, 26%
