Abstract
Traditional normalized tree edit distances do not satisfy the triangle inequality. We present a metric normalization method for tree edit distance, which results in a new normalized tree edit distance fulfilling the triangle inequality, under the condition that the weight function is a metric over the set of elementary edit operations with all costs of insertions/deletions having the same weight. We prove that the new distance, in the range [0, 1], is a genuine metric as a simple function of the sizes of two ordered labeled trees and the tree edit distance between them, which can be directly computed through tree edit distance with the same complexity. Based on an efficient algorithm to represent digits as ordered labeled trees, we show that the normalized tree edit metric can provide slightly better results than other existing methods in handwritten digit recognition experiments using the approximating and eliminating search algorithm (AESA) algorithm.
Similar content being viewed by others
References
Zhang K, Shasha D, Wang J T L. Approximate tree matching in the presence of variable length don’t cares. Journal of Algorithms, 1994, 16(1): 33–66
Tai K C. The tree-to-tree correction problem. Journal of the ACM, 1979, 26(3): 422–433
Zhang K, Shasha D. Simple fast algorithms for the editing distance between trees and related problems. SIAM Journal on Computing, 1989, 18(6): 1245–1262
Kilpeläinen P, Mannila H. Ordered and unordered tree inclusion. SIAM Journal on Computing, 1995, 24(2): 340–356
Klein P, Tirthapura S, Sharvit D, Kimia B. A tree-edit-distance algorithm for comparing simple, closed shapes. In: Proceedings of 11th Annual ACM-SIAM Symposium on Discrete Algorithms. 2000, 696–704
Hoffmann C M, O’Donnell M J. Pattern matching in trees. Journal of the ACM, 1982, 29(1): 68–95
Ramesh R, Ramakrishnan I V. Nonlinear pattern matching in trees. Journal of the ACM, 1992, 39(2): 295–316
Bille P. A survey on tree edit distance and related problems. Theoretical Computer Science, 2005, 337(1–3): 217–239
Levenshtein A. Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady, 1966, 10(8): 707–710
Sellers P H. On the theory and computation of evolutionary distances. SIAM Journal of Applied Mathematics, 1974, 26(4): 787–793
Wagner R A, Fischer M J. The string-to-string correction problem. Journal of the ACM, 1974, 21(1): 168–173
Weigel A, Fein F. Normalizing the weighted edit distance. In: Proceedings of 12th International Conference on Pattern Recognition. 1994, 399–402
Yianilos P N. Normalized Forms for Two Common Metrics. Report 91-082-9027-1, revision 7. July 2002, NEC Research Inst. http://www.pnylab.com/pny/
Rico-Juan J R, Mico L. Comparison of AESA and LAESA search algorithms using string and tree-edit-distance. Pattern Recognition Letters, 2003, 24(9–10): 1417–1426
Yujian L, Bo L. A normalized Levenshtein distance metric. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(6): 1091–1095
Marzal A, Vidal E. Computation of normalized edit distance and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1993, 15(9): 926–932
Schroeder M, Schweimeier R. Argument and misunderstandings: fuzzy unification for negotiating agents. Electronic Notes in Theoretical Computer Science, 2002, 70(5): 1–19
Vidal E. New formulation and improvements of the nearest-neighbour approximating and eliminating search algorithm (AESA). Pattern Recognition Letters, 1994, 15(1): 1–7
Lecun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998, 86(11): 2278–2324
Carrasco R C, Forcada ML. A note on the Nagen-draprasad-Wang-Gupta thining algorithm. Pattern Recognition Letters, 1995, 16(5): 539–541
Kong L B, Tang S W, Yang D Q, Wang T J, Gao J. Querying Techniques for XML Data. Journal of Software, 2007, 18(6): 1400–1418 (in Chinese)
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Li, Y., Chenguang, Z. A metric normalization of tree edit distance. Front. Comput. Sci. China 5, 119–125 (2011). https://doi.org/10.1007/s11704-011-9336-2
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11704-011-9336-2