research-article

Learning similarity with cosine similarity ensemble

Authors:

Fanzhang Li

Information Sciences—Informatics and Computer Science, Intelligent Systems, Applications: An International Journal, Volume 307, Issue C

Pages 39 - 52

https://doi.org/10.1016/j.ins.2015.02.024

Published: 20 June 2015

Abstract

This paper proposes a cosine similarity ensemble (CSE) method to learn similarity. CSE is a selective ensemble and combines multiple cosine similarity learners. A learner redefines the pattern vectors and determines its threshold adaptively. Experimental results show the superiority of CSE.

There is no doubt that similarity is a fundamental notion in the field of machine learning and pattern recognition. How to represent and measure similarity appropriately is a pursuit of many researchers. Many tasks, such as classification and clustering, can be accomplished perfectly when a similarity metric is well-defined. Cosine similarity is a widely used metric that is both simple and effective. This paper proposes a cosine similarity ensemble (CSE) method for learning similarity. In CSE, diversity is guaranteed by using multiple cosine similarity learners, each of which makes use of a different initial point to define the pattern vectors used in its similarity measures. The CSE method is not limited to measuring similarity using only pattern vectors that start at the origin. In addition, the thresholds of these separate cosine similarity learners are adaptively determined. The idea of using a selective ensemble is also implemented in CSE, and the proposed CSE method outperforms other compared methods on various data sets.
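The idea of measuring cosine similarity from an initial point other than the origin can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how initial points are chosen, how thresholds are adapted, or how learners are selected and combined. The function names, the fixed thresholds, and the majority vote below are all assumptions made for illustration only.

```python
import math


def shifted_cosine(x, y, origin):
    """Cosine of the angle between the vectors x - origin and y - origin."""
    u = [a - o for a, o in zip(x, origin)]
    v = [b - o for b, o in zip(y, origin)]
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a point that coincides with the initial point
        # has no direction, so it is treated as "not similar".
        return 0.0
    return dot / (norm_u * norm_v)


def ensemble_similar(x, y, learners):
    """Majority vote over hypothetical (initial_point, threshold) learners.

    Each learner votes "similar" when its shifted cosine reaches its
    threshold. Thresholds are fixed here; in CSE they are determined
    adaptively, in a way the abstract does not describe.
    """
    votes = sum(
        1 for origin, threshold in learners
        if shifted_cosine(x, y, origin) >= threshold
    )
    return 2 * votes > len(learners)


# Three hypothetical learners with different initial points and thresholds.
learners = [([0.0, 0.0], 0.5), ([1.0, 1.0], 0.9), ([0.0, 1.0], 0.9)]
print(ensemble_similar([2.0, 1.0], [3.0, 1.0], learners))
print(ensemble_similar([1.0, 0.0], [0.0, 1.0], learners))
```

Changing the initial point changes which pairs look aligned, which is one way to read the abstract's claim that different initial points give the ensemble its diversity.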

Index terms: Computing methodologies > Machine learning > Learning paradigms > Supervised learning

Published In

Information Sciences: an International Journal Volume 307, Issue C

June 2015

127 pages

ISSN: 0020-0255

Copyright © Elsevier Inc.

Publisher

Elsevier Science Inc.

United States
