DOI: 10.1145/3078971.3078977
research-article

Transductive Visual-Semantic Embedding for Zero-shot Learning

Published: 06 June 2017

Abstract

Zero-shot learning (ZSL) aims to transfer knowledge, via available semantic representations (e.g., attributes), from labeled source instances of seen classes to unlabeled target instances of unseen classes. Most existing ZSL approaches achieve this by learning a projection from the visual feature space to the semantic representation space based on the source instances, and directly applying it to the target instances. However, the intrinsic manifold structures residing in both semantic representations and visual features are not effectively incorporated into the learned projection function. Moreover, these methods may suffer from the inherent projection shift problem, because the seen and unseen classes are disjoint. To overcome these drawbacks, we propose a novel framework termed transductive visual-semantic embedding (TVSE) for ZSL. Specifically, TVSE first learns a latent embedding space that incorporates the manifold structures in both labeled source instances and unlabeled target instances under the transductive setting. In the learned space, each instance is viewed as a mixture of seen class scores. TVSE then constructs the relational mapping between seen and unseen classes using the available semantic representations, and applies it to map the seen class scores of the target instances to their predictions of unseen classes. Extensive experiments on four benchmark datasets demonstrate that the proposed TVSE achieves competitive performance compared with state-of-the-art methods for zero-shot recognition and retrieval tasks.
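
The last step the abstract describes, mapping a target instance's seen-class scores to an unseen-class prediction through a seen-to-unseen relation built from semantic representations, can be illustrated with a small sketch. This is not the authors' TVSE implementation: the abstract does not give the formulas, so cosine similarity between attribute vectors is assumed for the relation, and the function names, toy attribute vectors, and scores below are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relation_matrix(seen_attrs, unseen_attrs):
    """R[i][j] relates seen class i to unseen class j.

    Assumption: cosine similarity of class attribute vectors stands in for
    the relational mapping; the paper may define it differently.
    """
    return [[cosine(s, u) for u in unseen_attrs] for s in seen_attrs]


def predict_unseen(seen_scores, relation):
    """Map one instance's seen-class scores to the index of an unseen class."""
    n_unseen = len(relation[0])
    unseen_scores = [
        sum(seen_scores[i] * relation[i][j] for i in range(len(seen_scores)))
        for j in range(n_unseen)
    ]
    # Pick the unseen class with the highest transferred score.
    return max(range(n_unseen), key=unseen_scores.__getitem__)


# Toy data (hypothetical): 3 seen classes, 2 unseen classes, 3 attributes each.
seen_attrs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
unseen_attrs = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
R = relation_matrix(seen_attrs, unseen_attrs)

# A target instance scored as a mixture of seen classes, mostly classes 0 and 1.
print(predict_unseen([0.6, 0.3, 0.1], R))
```

In this toy setup the first unseen class shares attributes with seen classes 0 and 1, so an instance whose score mass sits on those two classes is assigned to it. In the paper the seen-class scores would come from the learned latent embedding space rather than being given by hand.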

Published In

ICMR '17: Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval
June 2017
524 pages
ISBN:9781450347013
DOI:10.1145/3078971
  • General Chairs: Bogdan Ionescu, Nicu Sebe
  • Program Chairs: Jiashi Feng, Martha Larson, Rainer Lienhart, Cees Snoek
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. manifold learning
  2. matrix factorization
  3. transductive learning
  4. zero-shot learning

Qualifiers

  • Research-article

Conference

ICMR '17

Acceptance Rates

ICMR '17 Paper Acceptance Rate 33 of 95 submissions, 35%;
Overall Acceptance Rate 254 of 830 submissions, 31%

Cited By

  • (2023) Deep Emotions Recognition from Facial Expressions using Deep Learning. VFAST Transactions on Software Engineering 11:2 (58-69). DOI: 10.21015/vtse.v11i2.1501. Online publication date: 19-Jun-2023
  • (2022) How robust are discriminatively trained zero-shot learning models? Image and Vision Computing 119:C. DOI: 10.1016/j.imavis.2022.104392. Online publication date: 1-Mar-2022
  • (2020) A Deep Dive into Adversarial Robustness in Zero-Shot Learning. Computer Vision – ECCV 2020 Workshops (3-21). DOI: 10.1007/978-3-030-66415-2_1. Online publication date: 23-Aug-2020
  • (2019) A Survey of Zero-Shot Learning. ACM Transactions on Intelligent Systems and Technology 10:2 (1-37). DOI: 10.1145/3293318. Online publication date: 16-Jan-2019
  • (2019) Zero-Shot Learning Using Stacked Autoencoder with Manifold Regularizations. 2019 IEEE International Conference on Image Processing (ICIP) (3651-3655). DOI: 10.1109/ICIP.2019.8803509. Online publication date: Sep-2019
  • (2019) Transductive Learning for Zero-Shot Object Detection. 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (6081-6090). DOI: 10.1109/ICCV.2019.00618. Online publication date: Oct-2019
  • (2019) Generalized zero-shot learning for action recognition with web-scale video data. World Wide Web 22:2 (807-824). DOI: 10.1007/s11280-018-0642-6. Online publication date: 1-Mar-2019
  • (2018) Pseudo Transfer with Marginalized Corrupted Attribute for Zero-shot Learning. Proceedings of the 26th ACM international conference on Multimedia (1802-1810). DOI: 10.1145/3240508.3240715. Online publication date: 15-Oct-2018
  • (2018) Discriminative and Orthogonal Subspace Constraints-Based Nonnegative Matrix Factorization. ACM Transactions on Intelligent Systems and Technology 9:6 (1-24). DOI: 10.1145/3229051. Online publication date: 1-Nov-2018
