
Large-Scale Discovery of Spatially Related Images

Authors:

Ondrej Chum,

Jirí Matas

IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 32, Issue 2

Pages 371 - 377

https://doi.org/10.1109/TPAMI.2009.166

Published: 01 February 2010

Abstract

We propose a randomized data mining method that finds clusters of spatially overlapping images. The core of the method relies on the min-Hash algorithm for fast detection of pairs of images with spatial overlap, the so-called cluster seeds. The seeds are then used as visual queries to obtain clusters, which are formed as transitive closures of sets of partially overlapping images that include the seed. We show that the probability of finding a seed for an image cluster rapidly increases with the size of the cluster. The properties and performance of the algorithm are demonstrated on data sets with $10^4$, $10^5$, and $5 \times 10^6$ images. The speed of the method depends on the size of the database and the number of clusters. The first stage of seed generation is close to linear for database sizes up to approximately $2^{34} \approx 10^{10}$ images. On a single 2.4 GHz PC, the clustering process took only 24 minutes for a standard database of more than 100,000 images, i.e., only 0.014 seconds per image.


Index Terms

Large-Scale Discovery of Spatially Related Images
1. Computing methodologies


Reviews

Reviewer: Dick Brodine

How do you retrieve all of the images in an image database that are similar to an image that you have chosen? To date, most retrieval algorithms gather images that are labeled with identical text tags. Chum and Matas' paper presents an algorithm that matches images on the basis of image content.

Images are summarized by an image descriptor vector, and min-Hash, a hashing algorithm, is applied to each descriptor. Descriptors with identical hash values are placed in the same bin. Similarity is calculated for each combination of descriptors in a bin. Image pairs with similarities greater than a certain level (threshold) are forwarded to a spatial consistency test, which compares the image feature geometries. Image pairs that pass spatial consistency are classified as cluster seeds. The cluster seeds are employed to amass other images that resemble the seed into the cluster, thereby grouping similar images. When the user supplies an image as part of a query, that image is compared to each of the known clusters. If a match is found, the images in that cluster are returned to the user.

This paper should be of interest to researchers in the field of image processing. It assumes a fair knowledge of statistical image representation (that is, visual words) on the part of the reader. That being said, the paper cites a number of references on this concept and other related topics.

Online Computing Reviews Service
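The binning and thresholding steps described in the review can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes each image is represented as a non-empty set of integer visual-word ids, uses simple linear hash functions, and leaves out the spatial consistency test entirely. All function names, parameters, and default values are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations


def make_hash_functions(count, seed=0):
    """Random linear hashes h(x) = (a*x + b) mod p over visual-word ids."""
    rng = random.Random(seed)
    prime = 2_147_483_647
    params = [(rng.randrange(1, prime), rng.randrange(0, prime)) for _ in range(count)]
    return [lambda x, a=a, b=b: (a * x + b) % prime for a, b in params]


def candidate_pairs(images, sketch_size=2, num_sketches=20, seed=0):
    """Bin images by min-Hash sketch and return pairs sharing any bin."""
    hashes = make_hash_functions(sketch_size * num_sketches, seed)
    bins = defaultdict(set)
    for image_id, words in images.items():
        for k in range(num_sketches):
            fns = hashes[k * sketch_size:(k + 1) * sketch_size]
            # A min-hash is the word whose hash value is smallest.
            sketch = (k,) + tuple(min(words, key=h) for h in fns)
            bins[sketch].add(image_id)
    pairs = set()
    for ids in bins.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def jaccard(a, b):
    """Set overlap: |intersection| / |union|."""
    return len(a & b) / len(a | b)


def find_seed_candidates(images, threshold=0.3, **kwargs):
    """Keep colliding pairs whose similarity reaches the threshold.

    In the described method these pairs would next go through a spatial
    consistency test, which this sketch does not implement.
    """
    return sorted((i, j) for i, j in candidate_pairs(images, **kwargs)
                  if jaccard(images[i], images[j]) >= threshold)


if __name__ == "__main__":
    demo = {
        "a": set(range(0, 100)),
        "b": set(range(0, 100)),      # identical to "a"
        "c": set(range(1000, 1100)),  # shares no words with "a" or "b"
    }
    print(find_seed_candidates(demo))
```

Only images that land in the same bin are ever compared, which is what keeps the candidate generation cheap relative to comparing all pairs.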



Published In


IEEE Transactions on Pattern Analysis and Machine Intelligence Volume 32, Issue 2

February 2010

192 pages

ISSN: 0162-8828


Publisher

IEEE Computer Society

United States


