Multiple complementary inverted indexing based on multiple metrics

Kai Zhang¹,
Wengang Zhou¹,
Shaoyan Sun¹ &
…
Bin Li¹

214 Accesses
2 Citations
Explore all metrics

Abstract

Inverted indexing based on vector quantization has been a popular technique in large scale information retrieval. With vector quantization based on a certain similarity metric, the sample space is partitioned into some voronoi cells, and samples in each cell are indexed by an inverted list. The nearest neighbors of a query are efficiently identified by looking up the cell where the query is located. To improve the recall, the sample space partitioning has been performed multiple times with different initializations of k-means to build multiple inverted indexes. While with the single similarity metric, e.g., Euclidean distance, high correlation may exist between multiple inverted indexes, which constrains the possible gain in recall. A new multiple inverted indexing method based on multiple sample space partitioning with multiple different similarity metrics is presented in this paper. Furthermore, several techniques for defining multiple metrics are investigated empirically. Experiments are conducted on 3 representative datasets, million-scale SIFT and GIST feature sets and a deep-learning-produced feature set, to properly evaluate the effectiveness of the proposed method. Experiment results show that the proposed method has competitive performance compared with the state-of-the-art inverted indexing methods in terms of recall and retrieval time, and the Latin-Hypercube weighting method can generate better diverse multiple metrics and get better gain in recall.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Overview of the SISAP 2023 Indexing Challenge

Effective product quantization-based indexing for nearest neighbor search

Article 04 May 2018

Large-scale high-dimensional indexing by sparse hashing with l ₀ approximation

Article 02 December 2016

Notes

References

Anh NT, Yusuke M, Toshihiko Y, Kiyoharu A (2015) Selective K-means tree search. ACM Multimedia Conference
Artem B, Victor L (2015) The inverted multi-index. IEEE Trans Pattern Anal Mach Intell 37:1247–1260
Article Google Scholar
Aude O, Antonio T (2001) Modeling the shape of the scene: A holistic representation of the spatial envelope. Springer Int J Comput Vision 42:145–175
Article Google Scholar
Babenko A, Lempitsky V (2012) The inverted multi-index. In: IEEE conference on computer vision and pattern recognition
Charikar MS (2002) Similarity estimation techniques from rounding algorithms. In: 34th ACM symposium on theory of computing. ACM
Dasgupta S, Stevens CF, Navlakha S (2017) A neural algorithm for a fundamental computing problem[J]. Science 358(6364):793–796
Article MathSciNet Google Scholar
David N, Henrik S (2006) Scalable recognition with a vocabulary tree. In: IEEE conference on computer vision and pattern recognition
David N, Michal B, Pavel Z (2015) Large-scale image retrieval using neural net descriptors. In: International conference on research on development in information retrieval
Defu C, Dongyuan S, Jinfu C (2014) Probabilistic load flow computation using Copula and Latin hypercube sampling. IET Gener Transm Distrib 8:1539–1549
Article Google Scholar
Dong W, Huchuan LU, Ziyang X, Ming-Hsuan Y (2015) Inverse sparse tracker with a locally weighted distance metric. IEEE IEEE Trans Image Process 24:2646–2657
Article MathSciNet Google Scholar
Edgar C, Gonzalo N, Ricardo B-Y, Marroquin JL (2001) Searching in metric spaces. ACM Comput Surv 33:273–321
Article Google Scholar
Ge T, He K, Ke Q, Sun J (2014) Optimized product quantization. IEEE Trans Pattern Anal Mach Intell 36(4):744–755
Article Google Scholar
Gong Y, Lazebnik S (2011) Iterative quantization: a procrustean approach to learning binary codes. In: IEEE international conference on computer vision and pattern recognition
Gray RM (1984) Vector quantization. IEEE Signal Process Mag 1:4–29
Google Scholar
Haiming L, Dawei S, Stefan R, Rui HU, Victoria U (2008) Comparing dissimilarity measures for content-based image retrieval. Springer Asia Information Retrieval Symposium, pp 44–50
Han YU, Chung C Y, Wong K P, Lee H W, Zhang J H (2009) Probabilistic load flow evaluation with hybrid latin hypercube sampling and cholesky decomposition. IEEE Trans Power Syst 24:661–667
Article Google Scholar
He K, Fang W, Jian S (2013) K-means hashing: an affinity-preserving quantization method for learning binary compact codes. In: IEEE conference on computer vision and pattern recognition
Helton J. C., Davis F. J. (2003) Latin hypercube sampling and the propagation of uncertainty in analyses of complex systems. Elsevier Reliab Eng Syst Safety 81:23–69
Article Google Scholar
Herve J, Matthijs D, Cordelia S (2011) Product quantization for nearest neighbor search. IEEE Trans Pattern Anal Mach Intell 33:117–128
Article Google Scholar
Herve J, Romain T, Matthijs D, Laurent A (2011) Searching in one billion vectors: re-rank with source coding. In: IEEE international conference on acoustics speech and signal processing
Hoi SCH, Wei L, Lyu MR, Ma W-Y (2006) Learning distance metrics with contextual constraints for image retrieval. In: IEEE conference on computer vision and pattern recognition
Hoi SCH, Wei L, Shih-Fu C (2008) Semi-supervised distance metric learning for collaborative image retrieval. In: IEEE conference on computer vision and pattern recognition
Jia Y, Shelhamer E, Donahue J, et al (2014) Caffe: convolutional architecture for fast feature embedding[C]. In: Proceedings of the 22nd ACM international conference on multimedia. ACM
Josef S, Andrew Z (2003) Video Google: a text retrieval approach to object matching in videos. In: IEEE international conference on computer vision
Kalantidis Y, Avrithis Y (2014) Locally optimized product quantization for approximate nearest neighbor search. In: IEEE conference on computer vision and pattern recognition. IEEE Computer Society, pp 2329–2336
Keinosuke F, Narendra Patrenahalli M (1975) A branch and bound algorithm for computing k-nearest neighbors. IEEE Trans Comput 100:750–753
MATH Google Scholar
Kevin L, Huei-Fang Y, Kuan-Hsien L, Jen-Hao H, Chu-Song C (2015) Rapid clothing retrieval via deep learning of binary codes and hierarchical search. In: ACM international conference on multimedia retrieval
Kulis B, Grauman K (2010) Kernelized locality-sensitive hashing for scalable image search. In: IEEE international conference on computer vision
Kyriakidis PC (2005) Sequential spatial simulation using latin hypercube sampling. Springer Quant Geol Geostat 14:65–74
Article Google Scholar
Lei Z, Yongdong Z, Jinhu T, Ke L, Qi T (2013) Binary code ranking with weighted hamming distance. In: IEEE conference on computer vision and pattern recognition
Lejsek H, Ásmundsson FH, Jónsson B (2009) NV-Tree: an efficient disk-based index for approximate search in very large high-dimensional collections[J]. IEEE Trans Pattern Anal Mach Intell 31(5):869–883
Article Google Scholar
Liang Z, Shengjin W, Ziqiong L, Qi T (2013) Lp-norm idf for large scale image search. In: IEEE conference on computer vision and pattern recognition
Liang Z, Shengjin W, Wengang Z, Qi T (2014) Bayes merging of multiple vocabularies for scalable image retrieval. In: IEEE conference on computer vision and pattern recognition
Loic P, Herve J, Laurent A (2010) Locality sensitive hashing: a comparison of hash function types and querying mechanisms. Elsevier Pattern Recogn Lett 31:1348–1358
Article Google Scholar
Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Springer Int J Comput Vision 60:91–110
Article Google Scholar
Marius M, Lowe DG (2009) Fast approximate nearest neighbors with automatic algorithm configuration. In: International conference on computer vision theory and application
McKay MD, Beckman RJ, William C (1979) Comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21:239–245
MathSciNet MATH Google Scholar
Oren B, Eli S, Michal I (2008) In defense of nearest-neighbor based image classification. In: IEEE conference on computer vision and pattern recognition
Padmashree D, Jagadeesh P, Anita K (2016) An image retrieval using combined approach wavelets and local binary pattern. In: IEEE international conference on information and automation
Shen F, Shen C, Liu W, et al (2015) Supervised discrete hashing. In: IEEE international conference on computer vision and pattern recognition (CVPR)
Simone S, Ramesh J (1999) Similarity measures. IEEE Trans Pattern Anal Mach Intell 21:871–883
Article Google Scholar
Sravanthi B, Davis LS (2016) Semantic binary codes. In: ACM international conference on multimedia retrieval
Weiss Y, Torralba A, Fergus R (2009) Spectral hashing. In: Advances in neural information processing systems
Wengang Z, Yijuan L U, Li H, Yibing S, Qi T (2010) Spatial coding for large scale partial-duplicate web image search. In: ACM multimedia conference
Wengang Z, Ming Y, Xiaoyu W, Li H, Yuanqing L, Qi T (2016) Scalable feature matching by dual cascaded scalar quantization for image retrieval. IEEE Trans Pattern Anal Mach Intell 38:159–171
Article Google Scholar
Yan X, He K, Fang W, Jian S (2013) Joint inverted indexing. In: IEEE international conference on computer vision
Yannis K, Yannis A (2014) Locally optimized product quantization for approximate nearest neighbor search. In: IEEE conference on computer vision and pattern recognition
Yu S-I, Jiang L, Zhongwen X, Yi Y, Hauptmann AG (2015) Content-based video search over 1 million videos with 1 core in 1 second. In: ACM international conference on multimedia retrieval
Yu L, Huang Z, Shen F, et al (2017) Bilinear optimized product quantization for scalable visual content analysis[J]. IEEE Trans Image Process 26:5057–5069
Article MathSciNet Google Scholar
Zhaohua Z, Jiancong T, Haibing H, Jin L, Li T, Stones RJ, Gang W, Xiaoguang L (2016) Leveraging context-free grammar for efficient inverted index compression. In: ACM international conference on research on development in information retrieval

Download references

Acknowledgements

The work is supported by the National Natural Science Foundation of China under grand No. 61473271 and No. 61331015.

Author information

Authors and Affiliations

The CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application Systems, University of Science and Technology of China, Hefei, China
Kai Zhang, Wengang Zhou, Shaoyan Sun & Bin Li

Authors

Kai Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Wengang Zhou
View author publications
You can also search for this author in PubMed Google Scholar
Shaoyan Sun
View author publications
You can also search for this author in PubMed Google Scholar
Bin Li
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Bin Li.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Zhang, K., Zhou, W., Sun, S. et al. Multiple complementary inverted indexing based on multiple metrics. Multimed Tools Appl 78, 7727–7747 (2019). https://doi.org/10.1007/s11042-018-6439-x

Download citation

Received: 02 August 2017
Revised: 08 July 2018
Accepted: 20 July 2018
Published: 16 August 2018
Issue Date: March 2019
DOI: https://doi.org/10.1007/s11042-018-6439-x

Multiple complementary inverted indexing based on multiple metrics

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Overview of the SISAP 2023 Indexing Challenge

Effective product quantization-based indexing for nearest neighbor search

Large-scale high-dimensional indexing by sparse hashing with l ₀ approximation

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher’s Note

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

Multiple complementary inverted indexing based on multiple metrics

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Overview of the SISAP 2023 Indexing Challenge

Effective product quantization-based indexing for nearest neighbor search

Large-scale high-dimensional indexing by sparse hashing with l 0 approximation

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher’s Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation

Large-scale high-dimensional indexing by sparse hashing with l ₀ approximation