DOI: 10.1145/1081870.1081960
Article

Formulating distance functions via the kernel trick

Published: 21 August 2005

Abstract

Tasks of data mining and information retrieval depend on a good distance function for measuring similarity between data instances. The most effective distance function must be formulated in a context-dependent (also application-, data-, and user-dependent) way. In this paper, we propose to learn a distance function by capturing the nonlinear relationships among contextual information provided by the application, data, or user. We show that through a process called the "kernel trick," such nonlinear relationships can be learned efficiently in a projected space. Theoretically, we substantiate that our method is both sound and optimal. Empirically, using several datasets and applications, we demonstrate that our method is effective and useful.
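
The abstract does not spell out the formulation, so the following is only a minimal sketch of the general idea behind a kernel-induced distance, not the paper's learned distance function. It relies on the standard identity d(x, y)^2 = K(x, x) - 2 K(x, y) + K(y, y), which gives the distance between two points in the projected space without ever computing the projection. The RBF kernel, the `gamma` value, and the function names are assumptions chosen for illustration.

```python
import math

def linear_kernel(x, y):
    """Plain dot product; the induced distance is the ordinary Euclidean one."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative choice, not from the paper."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def kernel_distance(x, y, kernel=rbf_kernel):
    """Distance in the kernel's feature space, computed from kernel values only."""
    d2 = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(d2, 0.0))

if __name__ == "__main__":
    x, y = (0.0, 0.0), (3.0, 4.0)
    print(kernel_distance(x, y, kernel=linear_kernel))  # Euclidean distance
    print(kernel_distance(x, y))                        # RBF-induced distance
```

With the linear kernel the function reduces to the usual Euclidean distance, which makes it a convenient sanity check; swapping in a nonlinear kernel changes the geometry while the calling code stays the same.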

      Published In

      KDD '05: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
      August 2005
      844 pages
      ISBN: 159593135X
      DOI: 10.1145/1081870

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. distance function
      2. kernel trick

      Qualifiers

      • Article

      Conference

      KDD05

      Acceptance Rates

      Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
