Multiview Discrete Hashing for Scalable Multimedia Search

Published: 01 June 2018

Abstract

Hashing techniques have recently gained increasing research interest in multimedia studies. Most existing hashing methods employ only a single feature type for hash code learning. Multiview data, in which each view corresponds to one type of feature, generally provides more comprehensive information, but efficiently integrating multiple views to learn compact hash codes remains challenging. In this article, we propose a novel unsupervised hashing method, dubbed multiview discrete hashing (MvDH), that effectively exploits multiview data. Specifically, MvDH performs matrix factorization to generate the hash codes as latent representations shared by multiple views, while simultaneously performing spectral clustering. This joint learning of hash codes and cluster labels enables MvDH to generate more discriminative hash codes, which are optimal for classification. An efficient alternating algorithm with guaranteed convergence and low computational complexity is developed to solve the proposed optimization problem. The binary codes are optimized via the discrete cyclic coordinate descent (DCC) method to reduce quantization errors. Extensive experiments on three large-scale benchmark datasets demonstrate the superiority of the proposed method over several state-of-the-art methods in terms of both accuracy and scalability.
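To make the idea of hash codes as a latent representation shared by multiple views concrete, the following is a minimal sketch and not the MvDH algorithm itself. It assumes a plain regularized least-squares factorization X_m ≈ V U_m for each view, solved by alternating updates, and it binarizes the shared factor V by simple sign thresholding. The spectral clustering term and the DCC-based discrete optimization described in the abstract are deliberately omitted, and all names (`multiview_mf_hash`, `hamming_distances`, `reg`, `n_bits`) are hypothetical.

```python
import numpy as np


def multiview_mf_hash(views, n_bits, n_iters=20, reg=1e-2, seed=0):
    """Toy shared-latent-factor hashing (illustrative only, not MvDH).

    views: list of arrays, each of shape (n_samples, d_m), one per view.
    Returns codes in {-1, +1} of shape (n_samples, n_bits) and the
    per-view bases U_m of shape (n_bits, d_m).
    """
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    # Centre each view so that thresholding at zero is meaningful.
    views = [X - X.mean(axis=0) for X in views]
    V = rng.standard_normal((n, n_bits))  # shared real-valued latent factor
    ridge = reg * np.eye(n_bits)
    for _ in range(n_iters):
        # Fix V: ridge-regression update of each view's basis, X_m ~ V @ U_m.
        bases = [np.linalg.solve(V.T @ V + ridge, V.T @ X) for X in views]
        # Fix all bases: update the shared factor using every view jointly.
        A = sum(U @ U.T for U in bases) + ridge
        B = sum(X @ U.T for X, U in zip(views, bases))
        V = np.linalg.solve(A, B.T).T  # A is symmetric, so this is B @ inv(A)
    # Assumption: sign thresholding replaces the paper's discrete optimization.
    codes = np.where(V >= 0, 1, -1).astype(np.int8)
    return codes, bases


def hamming_distances(query_code, codes):
    """Number of differing bits between one code and each row of codes."""
    return np.count_nonzero(codes != query_code, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    view_a = rng.standard_normal((100, 16))  # e.g. one feature type
    view_b = rng.standard_normal((100, 8))   # e.g. a second feature type
    codes, _ = multiview_mf_hash([view_a, view_b], n_bits=8)
    nearest = np.argsort(hamming_distances(codes[0], codes))[:5]
    print(codes.shape, nearest)
```

Thresholding a relaxed real-valued solution in this way is exactly the step that introduces quantization error, which is the issue the abstract says the discrete optimization is meant to reduce.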

Information & Contributors

Information

Published In

ACM Transactions on Intelligent Systems and Technology, Volume 9, Issue 5
Research Survey and Regular Papers
September 2018
274 pages
ISSN:2157-6904
EISSN:2157-6912
DOI:10.1145/3210369
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 June 2018
Accepted: 01 December 2017
Revised: 01 October 2017
Received: 01 April 2017
Published in TIST Volume 9, Issue 5

Author Tags

  1. Hashing
  2. multi-view
  3. multimedia search

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • National Science Foundation of China
