Deep hashing for multi-label image retrieval: a survey

Josiane Rodrigues¹,
Marco Cristo² &
Juan G. Colonna²

Abstract

Content-based image retrieval (CBIR) aims to return, as the result of a search, images with the same visual content as a query. This problem has attracted increasing attention in computer vision. Learning-based hashing techniques are among the most studied approaches to approximate nearest neighbor search in large-scale image retrieval. With the advance of deep neural networks in image representation, hashing methods for CBIR have started using deep learning to build binary codes. Such strategies are generally known as deep hashing techniques. In this paper, we present a comprehensive survey of deep hashing for multi-label image retrieval, categorizing the methods according to how the input images are treated (pointwise, pairwise, tripletwise and listwise) and discussing the relationships among them. In addition, we discuss the space cost, efficiency and search quality of the described models, as well as open issues and opportunities for future work.

Notes

Cross-modal refers to approaches that use two or more different modalities of signal representation as input to a neural network (Jiang and Li 2016).
The Hadamard product is a binary operation between matrices of the same dimension such that $A = B \odot C$ implies that $A_{i,j} = B_{i, j} C_{i,j}$.
The Jaccard coefficient measures the similarity between finite sample sets and is defined as the size of their intersection divided by the size of their union.
https://pytorch.org/.
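
The two definitions in the notes above (the Hadamard product and the Jaccard coefficient) can be illustrated with a minimal sketch in plain Python. The function names `hadamard` and `jaccard`, the example matrices, and the label sets are hypothetical and chosen only for illustration; they are not taken from the surveyed methods. Returning 0.0 for two empty sets is an assumption of this sketch, since the ratio is undefined in that case.

```python
def hadamard(B, C):
    """Element-wise product A[i][j] = B[i][j] * C[i][j] of two
    matrices of the same dimension, given as lists of rows."""
    if len(B) != len(C) or any(len(rb) != len(rc) for rb, rc in zip(B, C)):
        raise ValueError("matrices must have the same dimensions")
    return [[b * c for b, c in zip(rb, rc)] for rb, rc in zip(B, C)]


def jaccard(labels_a, labels_b):
    """Size of the intersection divided by the size of the union
    of two finite sets."""
    a, b = set(labels_a), set(labels_b)
    union = a | b
    if not union:
        # Assumption of this sketch: two empty sets score 0.0,
        # because the ratio 0/0 is undefined.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical inputs, for illustration only.
product = hadamard([[1, 2], [3, 4]], [[0, 1], [2, 0]])
similarity = jaccard({"sky", "sea", "boat"}, {"sky", "sea", "tree"})
print(product)     # [[0, 2], [6, 0]]
print(similarity)  # 0.5
```

In the example, the two label sets share two of four distinct labels, which gives a coefficient of 0.5.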

References

Andoni A, Indyk P (2006) Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In: 2006 47th annual IEEE symposium on foundations of computer science (FOCS’06), pp 459–468. https://doi.org/10.1109/FOCS.2006.49
Baeza-Yates R, Ribeiro-Neto B et al (1999) Modern information retrieval, vol 463. ACM Press, New York
Bezerra E (2016) Introdução à aprendizagem profunda. XXXI Simposio Brasileiro de Banco de Dados
Buda M, Maki A, Mazurowski MA (2018) A systematic study of the class imbalance problem in convolutional neural networks. Neural Netw 106:249–259
Cakir F, He K, Bargal SA, Sclaroff S (2018) Hashing with mutual information. arXiv preprint arXiv:1803.00974
Canziani A, Paszke A, Culurciello E (2016) An analysis of deep neural network models for practical applications. arXiv:1605.07678
Cao Y, Long M, Wang J, Zhu H, Wen Q (2016) Deep quantization network for efficient image retrieval. In: AAAI, pp 3457–3463
Chen Z, Cai R, Lu J, Feng J, Zhou J (2018) Order-sensitive deep hashing for multimorbidity medical image retrieval. In: International conference on medical image computing and computer-assisted intervention. Springer, Berlin, pp 620–628
Chua TS, Tang J, Hong R, Li H, Luo Z, Zheng Y (2009) Nus-wide: a real-world web image database from National University of Singapore. In: Proceedings of the ACM international conference on image and video retrieval. ACM, p 48
Courbariaux M, Hubara I, Soudry D, El-Yaniv R, Bengio Y (2016) Binarized neural networks: Training deep neural networks with weights and activations constrained to $+1$ or $-1$. arXiv preprint arXiv:1602.02830
Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In: IEEE conference on computer vision and pattern recognition, 2009. CVPR 2009. IEEE, pp 248–255
Do TT, Doan AD, Cheung NM (2016) Learning to hash with binary deep neural network. In: European conference on computer vision. Springer, Berlin, pp 219–234
Erin Liong V, Lu J, Wang G, Moulin P, Zhou J (2015) Deep hashing for compact binary codes learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2475–2483
Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The pascal visual object classes (VOC) challenge. Int J Comput Vis 88(2):303–338
Gong Y, Kumar S, Verma V, Lazebnik S (2012) Angular quantization-based binary codes for fast similarity search. In: Advances in neural information processing systems, pp 1196–1204
Gong Y, Kumar S, Rowley HA, Lazebnik S (2013a) Learning binary codes for high-dimensional data using bilinear projections. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 484–491
Gong Y, Lazebnik S, Gordo A, Perronnin F (2013b) Iterative quantization: a procrustean approach to learning binary codes for large-scale image retrieval. IEEE Trans Pattern Anal Mach Intell 35(12):2916–2929
Grubinger M, Clough P, Müller H, Deselaers T (2006) The iapr tc-12 benchmark: a new evaluation resource for visual information systems. In: International workshop OntoImage, vol 5
Hadsell R, Chopra S, LeCun Y (2006) Dimensionality reduction by learning an invariant mapping. In: 2006 IEEE computer society conference on computer vision and pattern recognition (CVPR’06), vol 2. IEEE, pp 1735–1742
He K, Zhang X, Ren S, Sun J (2014) Spatial pyramid pooling in deep convolutional networks for visual recognition. In: European conference on computer vision. Springer, Berlin, pp 346–361
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
Hijazi S, Kumar R, Rowen C (2015) Using convolutional neural networks for image recognition
Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) Mobilenets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861
Huang CQ, Yang SM, Pan Y, Lai HJ (2018) Object-location-aware hashing for multi-label image retrieval via automatic mask learning. IEEE Trans Image Process 27(9):4490–4502
Huiskes MJ, Lew MS (2008) The MIR flickr retrieval evaluation. In: Proceedings of the 1st ACM international conference on multimedia information retrieval. ACM, pp 39–43
Jain P, Kulis B, Grauman K (2008) Fast image search for learned metrics. In: 2008 IEEE conference on computer vision and pattern recognition. IEEE. https://doi.org/10.1109/CVPR.2008.4587841
Järvelin K, Kekäläinen J (2000) IR evaluation methods for retrieving highly relevant documents. In: Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval. ACM, pp 41–48
Järvelin K, Kekäläinen J (2002) Cumulated gain-based evaluation of IR techniques. ACM Trans Inf Syst 20(4):422–446
Jiang QY, Li WJ (2016) Deep cross-modal hashing. In: CoRR
Khan SH, Hayat M, Bennamoun M, Sohel FA, Togneri R (2018) Cost-sensitive learning of deep feature representations from imbalanced data. IEEE Trans Neural Netw Learn Syst 29(8):3573–3587
Krähenbühl P, Koltun V (2014) Geodesic object proposals. In: European conference on computer vision. Springer, Cham, pp 725–739
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp 1097–1105
Kulis B, Grauman K (2009) Kernelized locality-sensitive hashing for scalable image search. In: 2009 IEEE 12th international conference on computer vision. IEEE, pp 2130–2137
Kulis B, Jain P, Grauman K (2009) Fast similarity search for learned metrics. IEEE Trans Pattern Anal Mach Intell 31(12):2143–2157
Lai H, Pan Y, Liu Y, Yan S (2015) Simultaneous feature learning and hash coding with deep neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3270–3278
Lai H, Yan P, Shu X, Wei Y, Yan S (2016) Instance-aware hashing for multi-label image retrieval. IEEE Trans Image Process 25(6):2469–2479
Li WJ, Wang S, Kang WC (2015) Feature learning based deep supervised hashing with pairwise labels. arXiv preprint arXiv:1511.03855
Li T, Gao S, Xu Y (2017) Deep multi-similarity hashing for multi-label image retrieval. In: Proceedings of the 2017 ACM on conference on information and knowledge management. ACM, pp 2159–2162
Li Y, Miao Z, He M, Zhang Y, Li H (2018) Deep attention residual hashing. IEICE Trans Fundam Electron Commun Comput Sci 101(3):654–657
Liang D, Yan K, Wang Y, Zeng W, Yuan Q, Bao X, Tian Y (2017) Deep hashing with multi-task learning for large-scale instance-level vehicle search. In: 2017 IEEE international conference on multimedia and Expo workshops (ICMEW). IEEE, pp 192–197
Lin G, Shen C, Suter D, Van Den Hengel A (2013) A general two-step approach to learning-based hashing. In: Proceedings of the IEEE international conference on computer vision, pp 2552–2559
Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: common objects in context. In: European conference on computer vision. Springer, Berlin, pp 740–755
Lin K, Yang HF, Hsiao JH, Chen CS (2015) Deep learning of binary hash codes for fast image retrieval. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 27–35
Lin K, Lu J, Chen CS, Zhou J (2016) Learning compact binary descriptors with unsupervised deep neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1183–1192
Liu L, Qi H (2018) Discriminative cross-view binary representation learning. In: 2018 IEEE winter conference on applications of computer vision (WACV). IEEE, pp 1736–1744
Liu Y, Zhang D, Lu G, Ma WY (2007) A survey of content-based image retrieval with high-level semantics. Pattern Recogn 40(1):262–282
Liu TY et al (2009) Learning to rank for information retrieval. Found Trends® Inf Retr 3(3):225–331
Liu H, Wang R, Shan S, Chen X (2016) Deep supervised hashing for fast image retrieval. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2064–2072
Liu L, Rahimpour A, Taalimi A, Qi H (2017a) End-to-end binary representation learning via direct binary embedding. In: 2017 IEEE international conference on image processing (ICIP). IEEE, pp 1257–1261
Liu W, Ma H, Qi H, Zhao D, Chen Z (2017b) Deep learning hashing for mobile visual search. EURASIP J Image Video Process. https://doi.org/10.1186/s13640-017-0167-4
Lu J, Liong VE, Zhou X, Zhou J (2015) Learning compact binary face descriptor for face recognition. IEEE Trans Pattern Anal Mach Intell 37(10):2041–2056
Lu J, Liong VE, Zhou J (2017) Deep hashing for scalable image search. IEEE Trans Image Process 26(5):2352–2367
Ma C, Chen Z, Lu J, Zhou J (2018) Rank-consistency multi-label deep hashing. In: 2018 IEEE international conference on multimedia and expo (ICME). IEEE, pp 1–6
Norouzi M, Fleet DJ (2011) Minimal loss hashing for compact binary codes. In: Proceedings of the 28th international conference on machine learning (ICML-11). Citeseer, pp 353–360
Pennington J, Socher R, Manning C (2014) Glove: global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 1532–1543
Raginsky M, Lazebnik S (2009) Locality-sensitive binary codes from shift-invariant kernels. In: Advances in neural information processing systems, pp 1509–1517
Rahmani R, Goldman SA, Zhang H, Krettek J, Fritts JE (2005) Localized content based image retrieval. In: Proceedings of the 7th ACM SIGMM international workshop on multimedia information retrieval. ACM, pp 227–236
Rehman M, Iqbal M, Sharif M, Raza M (2012) Content based image retrieval: survey. World Appl Sci J 19(3):404–412
Schroff F, Kalenichenko D, Philbin J (2015) Facenet: a unified embedding for face recognition and clustering. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 815–823
Shen F, Gao X, Liu L, Yang Y, Shen HT (2017) Deep asymmetric pairwise hashing. In: Proceedings of the 25th ACM international conference on multimedia. ACM, pp 1522–1530
Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: ICLR
Singhai N, Shandilya SK (2010) A survey on: content based image retrieval systems. Int J Comput Appl 4(2):22–26
Song G, Tan X (2018) Learning multilevel semantic similarity for large-scale multi-label image retrieval. In: Proceedings of the 2018 ACM on international conference on multimedia retrieval. ACM, pp 64–72
Stutz D (2014) Understanding convolutional neural networks. In: Seminar report, Fakultät für Mathematik, Informatik und Naturwissenschaften Lehr-und Forschungsgebiet Informatik VIII computer vision
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1–9
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: 2016 IEEE conference on computer vision and pattern recognition, CVPR 2016, Las Vegas, NV, USA, June 27–30, 2016. IEEE Computer Society, pp 2818–2826. https://doi.org/10.1109/CVPR.2016.308
Wan J, Wu P, Hoi SC, Zhao P, Gao X, Wang D, Zhang Y, Li J (2015) Online learning to rank for content-based image retrieval. In: Twenty-fourth international joint conference on artificial intelligence
Wang J, Kumar S, Chang SF (2012) Semi-supervised hashing for large-scale search. IEEE Trans Pattern Anal Mach Intell 34(12):2393–2406
Wang J, Shen HT, Song J, Ji J (2014) Hashing for similarity search: a survey. arXiv preprint arXiv:1408.2927
Wang J, Liu W, Kumar S, Chang SF (2016a) Learning to hash for indexing big data—a survey. Proc IEEE 104(1):34–57
Wang X, Shi Y, Kitani KM (2016b) Deep supervised hashing with triplet labels. In: Asian conference on computer vision. Springer, Berlin, pp 70–84
Wang D, Huang H, Lin HK, Mao XL (2017a) Supervised hashing for multi-labeled data with order-preserving feature. In: Chinese national conference on social media processing. Springer, Berlin, pp 16–28
Wang F, Jiang M, Qian C, Yang S, Li C, Zhang H, Wang X, Tang X (2017b) Residual attention network for image classification. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 6450–6458. https://doi.org/10.1109/CVPR.2017.683
Wang J, Zhang T, Song J, Sebe N, Shen HT (2018) A survey on learning to hash. IEEE Trans Pattern Anal Mach Intell 40(4):769–790. https://doi.org/10.1109/TPAMI.2017.2699960
Weiss Y, Torralba A, Fergus R (2009) Spectral hashing. In: Advances in neural information processing systems, pp 1753–1760
Wu D, Lin Z, Li B, Ye M, Wang W (2017) Deep supervised hashing for multi-label and large-scale image retrieval. In: Proceedings of the 2017 ACM on international conference on multimedia retrieval. ACM, pp 150–158
Wu D, Lin Z, Li B, Liu J, Wang W (2018) Deep uniqueness-aware hashing for fine-grained multi-label image retrieval. In: 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 1683–1687
Xia R, Pan Y, Lai H, Liu C, Yan S (2014) Supervised hashing for image retrieval via image representation learning. In: Twenty-eighth AAAI conference on artificial intelligence
Xu J, Wang P, Tian G, Xu B, Zhao J, Wang F, Hao H (2015) Convolutional neural networks for text hashing. In: IJCAI, pp 1369–1375
Yang H, Lin K, Chen C (2018) Supervised learning of semantics-preserving hash via deep convolutional neural networks. IEEE Trans Pattern Anal Mach Intell 40(2):437–451. https://doi.org/10.1109/TPAMI.2017.2666812
Zhang H, Liu L, Long Y, Shao L (2017) Unsupervised deep hashing with pseudo labels for scalable image retrieval. IEEE Trans Image Process 27(4):1626–1638
Zhang Z, Zou Q, Wang Q, Lin Y, Li Q (2018) Instance similarity deep hashing for multi-label image retrieval. arXiv preprint arXiv:1803.02987
Zhao F, Huang Y, Wang L, Tan T (2015) Deep semantic ranking based hashing for multi-label image retrieval. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1556–1564
Zhong C, Yu Y, Tang S, Satoh S, Xing K (2017) Deep multi-label hashing for large-scale visual search based on semantic graph. In: Asia-Pacific Web (APWeb) and web-age information management (WAIM) joint conference on web and big data. Springer, Berlin, pp 169–184
Zhou Y, Huang S, Zhang Y, Wang Y (2017) Deep hashing with triplet quantization loss. In: Visual communications and image processing (VCIP), 2017 IEEE. IEEE, pp 1–4
Zhu H, Long M, Wang J, Cao Y (2016) Deep hashing network for efficient similarity retrieval. In: AAAI, pp 2415–2421
Zhu Y, Li Y, Wang S (2019) Unsupervised deep hashing with adaptive feature learning for image retrieval. IEEE Signal Process Lett 26(3):395–399
Zhuang B, Lin G, Shen C, Reid I (2016) Fast training of triplet-based deep binary embedding networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5955–5964

Author information

Authors and Affiliations

Instituto Federal de Rondônia, Porto Velho, Brazil
Josiane Rodrigues
Universidade Federal do Amazonas, Manaus, Brazil
Marco Cristo & Juan G. Colonna

Corresponding author

Correspondence to Josiane Rodrigues.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Rodrigues, J., Cristo, M. & Colonna, J.G. Deep hashing for multi-label image retrieval: a survey. Artif Intell Rev 53, 5261–5307 (2020). https://doi.org/10.1007/s10462-020-09820-x

Published: 27 February 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s10462-020-09820-x
