On the similarity metric and the distance metric

Published: 01 May 2009

Abstract

Similarity and dissimilarity measures are widely used in many research areas and applications. A dissimilarity measure is normally required to be a distance metric, but no comparable formal requirement exists for a similarity measure. This article makes three contributions. First, we give a formal definition of a similarity metric. Second, we show the relationship between similarity metrics and distance metrics. Third, we present general solutions for normalizing a given similarity metric or distance metric.
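
As a rough illustration of what "required to be a distance metric" means in practice, the sketch below checks the standard distance-metric axioms (non-negativity, symmetry, identity, triangle inequality) on a finite sample of points. The abstract does not state the article's own definitions or normalization formulas, so everything here is an assumption made for illustration: the function names `is_distance_metric` and `normalize` are hypothetical, and the mapping d / (1 + d) is a well-known textbook way to bound a metric, not necessarily one of the solutions the article presents.

```python
from itertools import product

def is_distance_metric(points, d, tol=1e-12):
    """Check the standard distance-metric axioms on a finite sample of points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < -tol:                      # non-negativity
            return False
        if abs(d(x, y) - d(y, x)) > tol:        # symmetry
            return False
        if (abs(d(x, y)) <= tol) != (x == y):   # d(x, y) = 0 exactly when x = y
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:   # triangle inequality
            return False
    return True

def normalize(d):
    """Illustrative normalization (an assumption): map d to d / (1 + d), in [0, 1)."""
    return lambda x, y: d(x, y) / (1.0 + d(x, y))

def absolute_difference(x, y):
    return abs(x - y)

def squared_difference(x, y):
    # Fails the triangle inequality, e.g. for the points 0, 1, 2.
    return (x - y) ** 2

sample = [0, 1, 2, 5, 9]
results = {
    "absolute": is_distance_metric(sample, absolute_difference),
    "normalized absolute": is_distance_metric(sample, normalize(absolute_difference)),
    "squared": is_distance_metric(sample, squared_difference),
}
```

A check like this can only refute the axioms on the sampled points; passing it does not prove that a function is a metric on the whole space.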

Publisher: Elsevier Science Publishers Ltd., United Kingdom

    Author Tags

    1. Distance metric
    2. Normalized distance metric
    3. Normalized similarity metric
    4. Similarity metric
