Abstract
The strong feature representation ability of deep learning enables content-based image retrieval (CBIR) to achieve higher retrieval accuracy, but CBIR still faces challenges such as the heavy demand for training labels and limited retrieval efficiency. In this paper, we propose an unsupervised adversarial image retrieval (UAIR) framework that removes the dependence on training labels. The framework consists of two opposing models linked by an adversarial loss function. For each input image, a generative model selects “well-matched” images from the database, and a discriminative model judges whether the selected images are similar enough to the input image. During training, the generative model tries to convince the discriminative model that the selected images are similar, while the discriminative model continually challenges the selections of the generative model. We compare UAIR with other state-of-the-art image retrieval methods, including recently reported GAN-based methods. Extensive experiments show that UAIR achieves significant improvement in CBIR with unsupervised adversarial training.
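The interplay between the two models can be illustrated with a small sketch. It is not the UAIR implementation: the abstract does not give the network architectures or the exact loss, so everything below is an assumption made for illustration. Images are stood in for by short feature vectors, the generative model is a softmax over dot-product similarity, the discriminative model is a sigmoid over the same similarity, and the names `generator_select`, `discriminator_prob`, and `generator_rewards` are hypothetical.

```python
import math
import random


def dot(u, v):
    """Dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def generator_select(query, database, k, rng):
    """Assumed generative step: sample k database indices (with replacement)
    from a softmax over query-image similarity."""
    probs = softmax([dot(query, img) for img in database])
    picks = rng.choices(range(len(database)), weights=probs, k=k)
    return picks, probs


def discriminator_prob(query, image):
    """Assumed discriminative step: probability that `image` is similar
    enough to `query`. A real model would use its own learned parameters."""
    return sigmoid(dot(query, image))


def generator_rewards(query, database, picks):
    """Feedback to the generator: how convinced the discriminator is
    by each selected image."""
    return [discriminator_prob(query, database[i]) for i in picks]


# Toy data (hypothetical): four 2-D feature vectors and one query.
rng = random.Random(0)
database = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.0]

picks, probs = generator_select(query, database, k=3, rng=rng)
rewards = generator_rewards(query, database, picks)
```

In an adversarial training loop, the rewards would drive an update of the generative model, while the discriminative model would be updated to score the generator's selections lower. How the discriminative model obtains its reference notion of "similar" without labels is not stated in the abstract, so that step is left out here.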
Acknowledgements
This research is funded by the Zhejiang Provincial Natural Science Foundation of China under Grant No. LR21F020002 and the National Natural Science Foundation of China under Grant No. 61976192.
Additional information
Communicated by B-K Bao.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Huang, L., Bai, C., Lu, Y. et al. Unsupervised adversarial image retrieval. Multimedia Systems 28, 673–685 (2022). https://doi.org/10.1007/s00530-021-00866-7
DOI: https://doi.org/10.1007/s00530-021-00866-7