Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Fast Near-Duplicate Image Detection Using Uniform Randomized Trees

Published: 04 July 2014 Publication History

Abstract

Indexing structure plays an important role in the application of fast near-duplicate image detection, since it can narrow down the search space. In this article, we develop a cluster of uniform randomized trees (URTs) as an efficient indexing structure to perform fast near-duplicate image detection. The main contribution in this article is that we introduce “uniformity” and “randomness” into the indexing construction. The uniformity requires classifying the object images into the same scale subsets. Such a decision makes good use of the two facts in near-duplicate image detection, namely: (1) the number of categories is huge; (2) a single category usually contains only a small number of images. Therefore, the uniform distribution is very beneficial to narrow down the search space and does not significantly degrade the detection accuracy. The randomness is embedded into the generation of feature subspace and projection direction, improveing the flexibility of indexing construction. The experimental results show that the proposed method is more efficient than the popular locality-sensitive hashing and more stable and flexible than the traditional KD-tree.

References

[1]
A. Andoni and P. Indyk. 2008. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Comm. ACM 51, 1, 117--122.
[2]
D. N. Bhat and S. K. Nayar. 1998. Ordinal measures for image correspondence. IEEE Trans. Pattern Anal. Mach. Intell. 20, 4, 415--423.
[3]
A. Bosch, A. Zisserman, and X. Muoz. 2007. Image classification using random forests and ferns. In Proceedings of the IEEE International Conference on Computer Vision. 1--8.
[4]
Y. Cao, H. Zhang, and J. Guo. 2011. Weakly supervised locality sensitive hashing for duplicate image retrieval. In Proceedings of the IEEE International Conference on Image Processing. 2461--2464.
[5]
E. Chang, J. Wang, C. Li, and G. Wiederhold. 1998. Rime: A replicated image detector for the world-wide web. In SPIE Multimedia Storage and Archiving System III, Vol. 3527, 58--67.
[6]
O. Chum, J. Philbin, M. Isard, and A. Zisserman. 2007. Scalable near identical image and shot detection. In Proceedings of the ACM International Conference on Image and Video Retrieval. 549--556.
[7]
M. Douze, H. Jegou, H. Sandhawalia, L. Amsaleg, and C. Schmid. 2009. Evaluation of gist descriptors for web-scale image search. In Proceedings of the ACM International Conference on Image and Video Retrieval. Vol. 19. 1--8.
[8]
Y. Hu, M. Li, and N. Yu. 2008. Efficient near-duplicate image detection by learning from examples. In Proceedings of the IEEE International Conference on Multimedia and Expo. 657--660.
[9]
M. J. Huiskes and M. S. Lew. 2008. The MIR Flickr retrieval evaluation. In Proceedings of the ACM International Conference on Multimedia Information Retrieval. 39--43.
[10]
H. Jegou, F. Perronnin, M. Douze, J. Sanchez, P. Perez, and C. Schmid. 2012. Aggregating local images descriptors into compact codes. IEEE Trans. Pattern Anal. Mach. Intell. 34, 9, 1704--1716.
[11]
A. Joly, O. Buisson, and C. Frelicot. 2007. Content-based copy retrieval using distortion-based probabilistic similarity search. IEEE Trans. Multimedia 9, 2, 293--306.
[12]
Y. Ke, R. Sukthankar, L. Huston, Y. Ke, and R. Sukthankar. 2004. Efficient near-duplicate detection and sub-image retrieval. In Proceedings of the ACM International Conference on Multimedia. 869--876.
[13]
C. Kim. 2003. Content-based image copy detection. Signal Process. Image Comm. 18, 3, 169--184.
[14]
Y. Lei, Y. Wang, and J. Huang. 2011. Robust image hash in radon transform domain for authentication. Signal Process. Image Comm. 26, 6, 280--288.
[15]
V. Lepetit and P. Fua. 2006. Keypoint recognition using randomized trees. IEEE Trans. Pattern Anal. Mach. Intell. 28, 9, 1465--1479.
[16]
D. G. Lowe. 2004. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60, 2, 91--110.
[17]
A. Oliva and A. Torralba. 2001. Modeling the shape of the scene: A holistic representation of the spatial envelope. Int. J. Comput. Vis. 42, 3, 145--175.
[18]
A. Qamra, Y. Meng, and E. Y. Chang. 2005. Enhanced perceptual distance functions and indexing for image replica recognition. IEEE Trans. Pattern Anal. Mach. Intell. 27, 3, 379--391.
[19]
G. Qiu, J. Morris, and X. Fan. 2007. Visual guided navigation for image retrieval. Pattern Recogn. 40, 6, 1711--1721.
[20]
J. Ramirez, J. M. Gorriz, R. Chaves, M. Lopez, D. Salas-Gonzalez, I. Alvarez, and F. Segovia. 2009. SPECT image classification using random forests. Electron. Lett. 45, 12, 604--605.
[21]
B. C. Russell, A. Torralba, K. P. Murphy, and W. T. Freeman. 2008. LabelMe: A database and web-based tool for image annotation. Int. J. Comput. Vis. 77, 1--3, 157--173.
[22]
G. Shakhnarovich. 2008. The source code of locality sensitive hashing. http://www.ttic.edu/gregory.
[23]
J. Sivic and A. Zisserman. 2003. Video google: A text retrieval approach to object matching in videos. In Proceedings of the IEEE International Conference on Computer Vision. 1470--1477.
[24]
M.-N. Wu, C.-C. Lin, and C.-C. Chang. 2007. Novel image copy detection with rotating tolerance. J. Syst. Softw. 80, 7, 1057--1069.
[25]
Z. Wu, Q. Ke, M. Isard, and J. Sun. 2009. Bundling features for large scale partial-duplicate web image search. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition. 25--32.
[26]
Z. Wu, Q. Xu, S. Jiang, Q. Huang, P. Cui, and L. Li. 2010. Adding affine invariant geometric constraint for partial-duplicate image retrieval. In Proceedings of the IEEE International Conference on Pattern Recognition. 842--845.
[27]
Z. Xu, H. Ling, F. Zou, Z. Lu, and P. Li. 2010. Robust image copy detection using multi-resolution histogram. In Proceedings of the ACM International Conference on Multimedia Information Retrieval. 129--136.
[28]
G. Yu, N. A. Goussies, J. Yuan, and Z. Liu. 2011. Fast action detection via discriminative random forest voting and top-k subvolume search. IEEE Trans. Multimedia 13, 3, 507--517.
[29]
L. Zheng, Y. Lei, G. Qiu, and J. Huang. 2012. Near-duplicate image detection in a visually salient riemannian space. IEEE Trans. Inf. Forens. Secur. 7, 5, 1578--1593.

Cited By

View all
  • (2024)Image deduplication using efficient visual indexing and retrieval: optimizing storage, time and energy for deep neural network trainingSignal, Image and Video Processing10.1007/s11760-024-03562-x18:12(9495-9503)Online publication date: 14-Sep-2024
  • (2024)Redundancy elimination in IoT oriented big data: a survey, schemes, open challenges and future applicationsCluster Computing10.1007/s10586-023-04209-127:1(1063-1087)Online publication date: 1-Feb-2024
  • (2022)Deep CNN based online image deduplication technique for cloud storage systemMultimedia Tools and Applications10.1007/s11042-022-13182-781:28(40793-40826)Online publication date: 1-Nov-2022
  • Show More Cited By

Index Terms

  1. Fast Near-Duplicate Image Detection Using Uniform Randomized Trees

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image ACM Transactions on Multimedia Computing, Communications, and Applications
      ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 10, Issue 4
      June 2014
      132 pages
      ISSN:1551-6857
      EISSN:1551-6865
      DOI:10.1145/2656131
      Issue’s Table of Contents
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 04 July 2014
      Accepted: 01 March 2014
      Revised: 01 January 2014
      Received: 01 May 2013
      Published in TOMM Volume 10, Issue 4

      Permissions

      Request permissions for this article.

      Check for updates

      Author Tags

      1. Fast near-duplicate image detection
      2. indexing structure
      3. uniform randomized tree

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)3
      • Downloads (Last 6 weeks)0
      Reflects downloads up to 15 Feb 2025

      Other Metrics

      Citations

      Cited By

      View all
      • (2024)Image deduplication using efficient visual indexing and retrieval: optimizing storage, time and energy for deep neural network trainingSignal, Image and Video Processing10.1007/s11760-024-03562-x18:12(9495-9503)Online publication date: 14-Sep-2024
      • (2024)Redundancy elimination in IoT oriented big data: a survey, schemes, open challenges and future applicationsCluster Computing10.1007/s10586-023-04209-127:1(1063-1087)Online publication date: 1-Feb-2024
      • (2022)Deep CNN based online image deduplication technique for cloud storage systemMultimedia Tools and Applications10.1007/s11042-022-13182-781:28(40793-40826)Online publication date: 1-Nov-2022
      • (2021)Identifying forged seal imprints using positive and unlabeled learningMultimedia Tools and Applications10.1007/s11042-020-10171-680:20(30761-30773)Online publication date: 1-Aug-2021
      • (2019)Original image tracing with image relational graph for near-duplicate image eliminationInternational Journal of Computational Science and Engineering10.5555/3337494.333750418:3(294-304)Online publication date: 25-May-2019
      • (2018)Intersection Traffic Prediction Using Decision Tree ModelsSymmetry10.3390/sym1009038610:9(386)Online publication date: 7-Sep-2018
      • (2018)Deep Constrained Siamese Hash Coding Network and Load-Balanced Locality-Sensitive Hashing for Near Duplicate Image DetectionIEEE Transactions on Image Processing10.1109/TIP.2018.283988627:9(4452-4464)Online publication date: Sep-2018
      • (2018)Processing Social Media Images by Combining Human and Machine Computing during CrisesInternational Journal of Human–Computer Interaction10.1080/10447318.2018.142783134:4(311-327)Online publication date: 26-Jan-2018
      • (2018)Data deduplication techniques for efficient cloud storage managementThe Journal of Supercomputing10.1007/s11227-017-2210-874:5(2035-2085)Online publication date: 1-May-2018
      • (2016)Fast near-duplicate detection from image streams on online social media during disaster events2016 IEEE Annual India Conference (INDICON)10.1109/INDICON.2016.7839137(1-6)Online publication date: Dec-2016
      • Show More Cited By

      View Options

      Login options

      Full Access

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Figures

      Tables

      Media

      Share

      Share

      Share this Publication link

      Share on social media