Random forests for metric learning with implicit pairwise position dependence

Authors:

Jason J. Corso

KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining

Pages 958 - 966

https://doi.org/10.1145/2339530.2339680

Published: 12 August 2012

Abstract

Metric learning makes it possible to learn semantically meaningful distances for complex distributions of data using label or pairwise constraint information. However, to date, most metric learning methods are based on a single Mahalanobis metric, which cannot handle heterogeneous data well. Those that learn multiple metrics throughout the feature space have demonstrated superior accuracy, but at a severe cost to computational efficiency. Here, we adopt a new angle on the metric learning problem and learn a single metric that is able to implicitly adapt its distance function throughout the feature space. This metric adaptation is accomplished by using a random forest-based classifier to underpin the distance function and by incorporating both absolute pairwise position and standard relative position into the representation. We have implemented and tested our method against state-of-the-art global and multi-metric methods on a variety of data sets. Overall, the proposed method outperforms both types of methods in terms of accuracy (consistently ranked first) and is an order of magnitude faster than state-of-the-art multi-metric methods (16x faster in the worst case).
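The pair representation described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the encoding, so the choices below are assumptions: the relative position is taken as the element-wise absolute difference of the two points, the absolute position as their element-wise midpoint, and the names `pair_features` and `build_pair_dataset` are hypothetical helpers, not the authors' code.

```python
from itertools import combinations


def pair_features(x_i, x_j):
    """Map a pair of points to one symmetric feature vector.

    Assumed encoding (not stated in the abstract): the relative position is
    the element-wise absolute difference and the absolute position is the
    element-wise midpoint of the pair.
    """
    relative = [abs(a - b) for a, b in zip(x_i, x_j)]
    absolute = [(a + b) / 2.0 for a, b in zip(x_i, x_j)]
    return relative + absolute


def build_pair_dataset(points, labels):
    """Turn labelled points into (pair feature, target) training rows.

    Target 0 means the two points share a class label (should be close);
    target 1 means they do not (should be far).
    """
    rows, targets = [], []
    for i, j in combinations(range(len(points)), 2):
        rows.append(pair_features(points[i], points[j]))
        targets.append(0 if labels[i] == labels[j] else 1)
    return rows, targets


# Toy data: two pairs with the same offset but in different regions.
points = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [11.0, 10.0]]
labels = ["a", "b", "c", "c"]
rows, targets = build_pair_dataset(points, labels)
```

In this toy data the first and last pairs have the same relative part but opposite targets, so a function of the difference alone (as in a single Mahalanobis metric) cannot separate them, while the midpoint component can. Rows of this form could be passed to any random forest classifier, with the share of trees voting "different" read as a distance; that reading is likewise an assumption rather than a detail given in the abstract.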

Index Terms

Random forests for metric learning with implicit pairwise position dependence
1. Information systems
  1. Information systems applications
    1. Data mining

Published In

KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining

August 2012

1616 pages

ISBN:9781450314626

DOI:10.1145/2339530

General Chair: Qiang Yang (Hong Kong University of Science and Technology)
Program Chairs: Deepak Agarwal (LinkedIn), Jian Pei (Simon Fraser University)

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

KDD '12

KDD '12: The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 12 - 16, 2012

Beijing, China

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Bibliometrics

Total Citations: 35
Total Downloads: 629
Downloads (Last 12 months): 11
Downloads (Last 6 weeks): 2

Reflects downloads up to 12 Nov 2024

Cited By

Lin Z, Huang X, Chen J (2023). Prediction Model of Sailing Ship Price Based on Random Forest Regression. Highlights in Business, Economics and Management, 16 (29-33). Online publication date: 2-Aug-2023.
https://doi.org/10.54097/hbem.v16i.10530
Rudar J, Golding G, Kremer S, Hajibabaei M (2023). Decision Tree Ensembles Utilizing Multivariate Splits Are Effective at Investigating Beta Diversity in Medically Relevant 16S Amplicon Sequencing Data. Microbiology Spectrum, 11:2. Online publication date: 13-Apr-2023.
https://doi.org/10.1128/spectrum.02065-22
Jiang W, Zhang X, Li S, Song S, Zhao H (2022). An unbiased kinship estimation method for genetic data analysis. BMC Bioinformatics, 23:1. Online publication date: 6-Dec-2022.
https://doi.org/10.1186/s12859-022-05082-2
Bicego M, Cicalese F, Mensi A (2021). RatioRF: a novel measure for Random Forest clustering based on the Tversky's Ratio model. IEEE Transactions on Knowledge and Data Engineering (1-1). Online publication date: 2021.
https://doi.org/10.1109/TKDE.2021.3086147
Elgui K, Bianchi P, Isson O, Portier F, Marty R (2020). Metric Learning for Fingerprint RSSI-Localization. 2020 IEEE/ION Position, Location and Navigation Symposium (PLANS) (1036-1042). Online publication date: Apr-2020.
https://doi.org/10.1109/PLANS46316.2020.9110145
Wang R, Chen J, Wang Y, Jiao L, Wang M (2020). SAR Image Change Detection via Spatial Metric Learning With an Improved Mahalanobis Distance. IEEE Geoscience and Remote Sensing Letters, 17:1 (77-81). Online publication date: Jan-2020.
https://doi.org/10.1109/LGRS.2019.2915251
Wang Z, Zuo R, Jing L (2020). Fusion of Geochemical and Remote-Sensing Data for Lithological Mapping Using Random Forest Metric Learning. Mathematical Geosciences. Online publication date: 31-Oct-2020.
https://doi.org/10.1007/s11004-020-09897-8
Xu X, Cao H, Yang Y, Yang E, Deng C (2019). Zero-shot metric learning. Proceedings of the 28th International Joint Conference on Artificial Intelligence (3996-4002). Online publication date: 10-Aug-2019.
https://dl.acm.org/doi/10.5555/3367471.3367597
Shah S, Angel Y, Houborg R, Ali S, McCabe M (2019). A Random Forest Machine Learning Approach for the Retrieval of Leaf Chlorophyll Content in Wheat. Remote Sensing, 11:8 (920). Online publication date: 16-Apr-2019.
https://doi.org/10.3390/rs11080920
Elgui K, Bianchi P, Portier F, Isson O (2019). Learning Methods for RSSI-based Geolocation: A Comparative Study. 2019 27th European Signal Processing Conference (EUSIPCO) (1-5). Online publication date: Sep-2019.
https://doi.org/10.23919/EUSIPCO.2019.8903160
