research-article

Large-Scale Discovery of Spatially Related Images

Published: 01 February 2010

Abstract

We propose a randomized data mining method that finds clusters of spatially overlapping images. The core of the method relies on the min-Hash algorithm for fast detection of pairs of images with spatial overlap, the so-called cluster seeds. The seeds are then used as visual queries to obtain clusters which are formed as transitive closures of sets of partially overlapping images that include the seed. We show that the probability of finding a seed for an image cluster rapidly increases with the size of the cluster. The properties and performance of the algorithm are demonstrated on data sets with 10^4, 10^5, and 5 \times 10^6 images. The speed of the method depends on the size of the database and the number of clusters. The first stage of seed generation is close to linear for database sizes up to approximately 2^{34} \approx 10^{10} images. On a single 2.4 GHz PC, the clustering process took only 24 minutes for a standard database of more than 100,000 images, i.e., only 0.014 seconds per image.
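As a rough illustration of the seed-generation stage described in the abstract, the following Python sketch hashes each image's set of visual-word IDs with min-Hash, groups several min-Hash values into sketches, and reports image pairs that share at least one sketch as seed candidates. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the linear hash family, and the default values of `sketch_size` and `num_sketches` are all illustrative choices, and the spatial verification that the paper applies to candidates is omitted.

```python
import random
from collections import defaultdict
from itertools import combinations

PRIME = 2_147_483_647  # assumed modulus; visual-word IDs must be ints in [0, PRIME)

def make_hash_funcs(num_funcs, seed=0):
    """Draw random linear hash functions h(w) = (a*w + b) mod PRIME (an assumed hash family)."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(num_funcs)]

def min_hashes(words, funcs):
    """One min-Hash per hash function: the smallest hash value over the (non-empty) word set."""
    return tuple(min((a * w + b) % PRIME for w in words) for a, b in funcs)

def find_seed_candidates(images, sketch_size=2, num_sketches=64, seed=0):
    """Return pairs of image IDs whose word sets agree on at least one whole sketch.

    `images` maps an image ID to a non-empty set of integer visual-word IDs.
    """
    funcs = make_hash_funcs(sketch_size * num_sketches, seed)
    buckets = defaultdict(set)
    for image_id, words in images.items():
        hashes = min_hashes(words, funcs)
        for k in range(num_sketches):
            sketch = hashes[k * sketch_size:(k + 1) * sketch_size]
            buckets[(k, sketch)].add(image_id)  # images with the same k-th sketch share a bucket
    candidates = set()
    for ids in buckets.values():
        candidates.update(combinations(sorted(ids), 2))
    return candidates

def jaccard(a, b):
    """Set overlap |a & b| / |a | b|, the similarity that min-Hash estimates."""
    return len(a & b) / len(a | b)

def collision_probability(similarity, sketch_size, num_sketches):
    """Chance that two sets share at least one sketch, assuming independent min-Hash functions."""
    return 1.0 - (1.0 - similarity ** sketch_size) ** num_sketches

images = {
    "a": {1, 2, 3, 4, 5},
    "b": {1, 2, 3, 4, 5},   # same words as "a": collides in every sketch
    "c": {100, 101, 102},   # disjoint from "a" and "b": can never collide
}
candidates = find_seed_candidates(images)
```

Only images that fall into the same bucket are ever compared, which is why this stage can avoid examining all image pairs. In a fuller pipeline, each candidate pair would then be filtered by a similarity threshold (for example with `jaccard`) and by a geometric check before being accepted as a cluster seed.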


Reviews

Dick Brodine

How do you retrieve all of the images in an image database that are similar to an image that you have chosen? To date, most retrieval algorithms gather images that are labeled with identical text tags. Chum and Matas' paper presents an algorithm that matches images on the basis of image content.

Images are summarized by an image descriptor vector, and min-Hash, a hashing algorithm, is applied to each descriptor. Descriptors with identical hash values are placed in the same bin. Similarity is calculated for each combination of descriptors in a bin. Image pairs with similarities greater than a certain level (threshold) are forwarded to a spatial consistency test, which compares the geometries of the image features. Image pairs that pass the spatial consistency test are classified as cluster seeds. The cluster seeds are then used to gather other images that resemble the seed into the cluster, thereby grouping similar images. When the user supplies an image as part of a query, that image is compared to each of the known clusters. If a match is found, the images in that cluster are returned to the user.

This paper should be of interest to researchers in the field of image processing. It assumes a fair knowledge of statistical image representation (that is, visual words) on the part of the reader. That being said, the paper cites a number of references on this concept and other related topics.

Online Computing Reviews Service
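The cluster-growing step summarized in the review, where a seed pair is expanded into the transitive closure of overlapping images, can be sketched as a breadth-first search. This is only an illustration under assumptions: the `overlaps` dictionary is a hypothetical stand-in for the result of issuing each image as a visual query and keeping the spatially verified matches, and `grow_cluster` is an illustrative name, not a function from the paper.

```python
from collections import deque

def grow_cluster(seed_pair, overlaps):
    """Expand a seed pair into every image reachable through chains of overlapping images.

    `overlaps` maps an image ID to the IDs of images assumed to have been
    verified as spatially overlapping with it (hypothetical query results).
    """
    cluster = set(seed_pair)
    queue = deque(seed_pair)
    while queue:
        query = queue.popleft()              # use this image as the next visual query
        for neighbor in overlaps.get(query, ()):
            if neighbor not in cluster:      # each image is added and queried only once
                cluster.add(neighbor)
                queue.append(neighbor)
    return cluster

# Toy data: a-b-c-d form a chain of partial overlaps; x-y are a separate pair.
overlaps = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
    "x": ["y"],
    "y": ["x"],
}
cluster = grow_cluster(("a", "b"), overlaps)
```

In the toy data, images "a" and "d" do not overlap directly, yet both end up in the same cluster because they are linked through "b" and "c"; the unrelated pair "x" and "y" is left out.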



Published In

IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 32, Issue 2
February 2010
192 pages

Publisher

IEEE Computer Society

United States


Author Tags

  1. bag of words
  2. image clustering
  3. image retrieval
  4. minHash

Qualifiers

  • Research-article


