Abstract
The Tanimoto coefficient has previously been proven to be a metric, but only in the case of binary-valued vectors. Moreover, it has been proven that the Tanimoto coefficient for real-valued vectors is not a metric. This means that metric-based data structures cannot be used directly to accelerate Tanimoto queries. This note presents a method for transforming Tanimoto queries into range queries in Euclidean space, making it possible to use metric data structures as well as data structures designed for Euclidean space.
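As an illustration of the kind of transformation described above, here is a minimal sketch, assuming the common real-valued definition T(a, b) = a·b / (|a|² + |b|² − a·b), a non-zero query q, and a threshold 0 < t ≤ 1. Under those assumptions, T(q, x) ≥ t rearranges to |x − kq|² ≤ (k² − 1)|q|² with k = (1 + t)/(2t), which is a Euclidean ball around a scaled copy of the query. The abstract does not state these formulas, and the function names below are hypothetical, so the details may differ from the construction in the note.

```python
import math


def tanimoto(a, b):
    """Real-valued Tanimoto coefficient (assumed definition, see lead-in)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sum(x * x for x in a) + sum(y * y for y in b) - dot)


def tanimoto_query_ball(q, t):
    """Return (center, radius) of the Euclidean ball equivalent to T(q, x) >= t.

    Hypothetical helper: assumes q is non-zero and 0 < t <= 1.
    """
    if not 0.0 < t <= 1.0:
        raise ValueError("threshold t must satisfy 0 < t <= 1")
    k = (1.0 + t) / (2.0 * t)          # scaling factor for the query
    center = [k * x for x in q]        # ball center is k * q
    norm_q = math.sqrt(sum(x * x for x in q))
    # k >= 1 for t in (0, 1], so k*k - 1 >= 0; max() guards against rounding.
    radius = norm_q * math.sqrt(max(k * k - 1.0, 0.0))
    return center, radius


def in_ball(x, center, radius):
    """Plain Euclidean range test, the kind a k-d tree or M-tree answers."""
    return math.dist(x, center) <= radius


if __name__ == "__main__":
    q = [1.0, 0.0]
    center, radius = tanimoto_query_ball(q, 0.5)
    for x in ([2.0, 0.0], [3.0, 0.0], [0.0, 1.0]):
        print(x, tanimoto(q, x) >= 0.5, in_ball(x, center, radius))
```

Because the resulting test is an ordinary range query, the ball could in principle be handed to any Euclidean index; the brute-force `in_ball` check stands in for that index here.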
Cite this article
Kristensen, T.G. Transforming Tanimoto queries on real valued vectors to range queries in Euclidian space. J Math Chem 48, 287–289 (2010). https://doi.org/10.1007/s10910-010-9668-4
DOI: https://doi.org/10.1007/s10910-010-9668-4