Abstract
Compression-based pattern recognition measures the similarity between objects by relying on data compression techniques. This paper improves current compression-based pattern recognition by exploiting new features that are easy to obtain. In particular, we study two known methods: PRDC (Pattern Representation based on Data Compression) and NMD (Normalized Multiset Distance). PRDC represents an object x with a feature vector that lines up the compression ratios obtained by compressing x with multiple dictionaries. We enhance PRDC by extracting novel features from the compressed files. NMD measures the similarity between two objects by comparing their compression dictionaries. We extend NMD by incorporating the length of the words in the dictionaries into the similarity measure.
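As a rough illustration of the two baseline ideas, the following Python sketch builds LZW-style dictionaries, forms a PRDC-style vector of compression ratios, and compares two dictionaries as sets. All function names are hypothetical, the compression ratio is approximated as codewords per input character, and the Jaccard overlap is only a stand-in for a dictionary comparison; it is not the NMD formula and does not include the new features proposed in the paper.

```python
from __future__ import annotations


def lzw_dictionary(text: str) -> set[str]:
    """Collect the phrases an LZW-style parse of `text` would learn."""
    phrases = set(text)  # start from the single characters that occur
    current = ""
    for ch in text:
        candidate = current + ch
        if candidate in phrases:
            current = candidate      # keep extending the current phrase
        else:
            phrases.add(candidate)   # learn a new phrase
            current = ch
    return phrases


def compression_ratio(text: str, dictionary: set[str]) -> float:
    """Greedy longest-match parse of `text` with a fixed dictionary.

    Assumption: the ratio is (emitted codewords) / (input characters).
    A character with no multi-character match costs one codeword.
    """
    if not text:
        return 0.0
    max_len = max((len(p) for p in dictionary), default=1)
    i = 0
    codewords = 0
    while i < len(text):
        step = 1
        # Try the longest candidate phrase first, down to length 2.
        for length in range(min(max_len, len(text) - i), 1, -1):
            if text[i:i + length] in dictionary:
                step = length
                break
        i += step
        codewords += 1
    return codewords / len(text)


def prdc_style_features(text: str, dictionaries: list[set[str]]) -> list[float]:
    """One compression ratio per dictionary, lined up as a feature vector."""
    return [compression_ratio(text, d) for d in dictionaries]


def dictionary_overlap(a: set[str], b: set[str]) -> float:
    """Jaccard overlap of two dictionaries (a stand-in, not NMD)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

In this sketch, an object compresses well (low ratio) under a dictionary learned from similar data and poorly (ratio near 1) under an unrelated one, which is what makes the vector of ratios usable as a feature vector.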
Acknowledgments
This work was supported by JSPS KAKENHI Grant Number JP15K00148, 2016.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Uchino, T., Koga, H., Toda, T. (2017). Improved Compression-Based Pattern Recognition Exploiting New Useful Features. In: Alexandre, L., Salvador Sánchez, J., Rodrigues, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2017. Lecture Notes in Computer Science, vol 10255. Springer, Cham. https://doi.org/10.1007/978-3-319-58838-4_40
DOI: https://doi.org/10.1007/978-3-319-58838-4_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58837-7
Online ISBN: 978-3-319-58838-4
eBook Packages: Computer Science, Computer Science (R0)