Transductive Visual-Semantic Embedding for Zero-shot Learning

Authors:

Zi Huang

ICMR '17: Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval

Pages 41 - 49

https://doi.org/10.1145/3078971.3078977

Published: 06 June 2017

Abstract

Zero-shot learning (ZSL) aims to transfer knowledge from labeled source instances of seen classes to unlabeled target instances of unseen classes via available semantic representations (e.g., attributes). Most existing ZSL approaches achieve this by learning a projection from the visual feature space to the semantic representation space on the source instances and applying it directly to the target instances. However, the intrinsic manifold structures residing in both the semantic representations and the visual features are not effectively incorporated into the learned projection function. Moreover, these methods may suffer from the inherent projection shift problem, because the seen and unseen classes are disjoint. To overcome these drawbacks, we propose a novel framework termed transductive visual-semantic embedding (TVSE) for ZSL. Specifically, TVSE first learns a latent embedding space that incorporates the manifold structures in both the labeled source instances and the unlabeled target instances under the transductive setting. In the learned space, each instance is viewed as a mixture of seen class scores. TVSE then constructs the relational mapping between seen and unseen classes using the available semantic representations, and applies it to map the seen class scores of the target instances to their predictions over unseen classes. Extensive experiments on four benchmark datasets demonstrate that the proposed TVSE achieves competitive performance compared with state-of-the-art methods on zero-shot recognition and retrieval tasks.
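The last step described in the abstract, mapping an instance's seen class scores to unseen class predictions through a relation built from semantic representations, can be illustrated with a minimal sketch. This is not the TVSE algorithm: the latent embedding learning is omitted entirely, and the use of cosine similarity as the relation, the function names, and the toy attribute vectors are all assumptions made for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def relation_matrix(seen_attrs, unseen_attrs):
    """R[i][j] relates seen class i to unseen class j.

    Cosine similarity is an assumed, hypothetical choice of relation;
    the abstract does not specify how the relational mapping is built.
    """
    return [[cosine(s, u) for u in unseen_attrs] for s in seen_attrs]

def unseen_scores(seen_scores, relation):
    """Map one instance's seen class scores to unseen class scores."""
    n_seen = len(seen_scores)
    n_unseen = len(relation[0])
    return [
        sum(seen_scores[i] * relation[i][j] for i in range(n_seen))
        for j in range(n_unseen)
    ]

def predict(seen_scores, relation):
    """Index of the unseen class with the highest mapped score."""
    scores = unseen_scores(seen_scores, relation)
    return max(range(len(scores)), key=scores.__getitem__)

# Toy data (made up): 3 seen classes, 2 unseen classes, 3 binary attributes.
seen_attrs = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
unseen_attrs = [[1, 1, 0], [0, 1, 1]]
R = relation_matrix(seen_attrs, unseen_attrs)

# A target instance whose seen class scores lean toward seen class 0.
target_seen_scores = [0.7, 0.2, 0.1]
predicted_unseen = predict(target_seen_scores, R)
```

In this toy setup the target instance resembles seen class 0, which shares an attribute only with the first unseen class, so the mapped scores favor that class. In a transductive setting the seen class scores would come from an embedding learned on both source and target instances rather than being given by hand.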

Index Terms

Transductive Visual-Semantic Embedding for Zero-shot Learning
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Multi-task learning
        Transfer learning
      2. Supervised learning
        Supervised learning by classification

Information & Contributors

Information

Published In

ICMR '17: Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval

June 2017

524 pages

ISBN:9781450347013

DOI:10.1145/3078971

General Chairs:
Bogdan Ionescu, University Politehnica of Bucharest, Romania
Nicu Sebe, University of Trento, Italy

Program Chairs:
Jiashi Feng, National University of Singapore, Singapore
Martha Larson, Radboud University & Delft University of Technology, The Netherlands
Rainer Lienhart, University of Augsburg, Germany
Cees Snoek, University of Amsterdam & Qualcomm Research Netherlands, The Netherlands

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ICMR '17

Sponsor:

SIGMM

ICMR '17: International Conference on Multimedia Retrieval

June 6 - 9, 2017

Bucharest, Romania

Acceptance Rates

ICMR '17 Paper Acceptance Rate 33 of 95 submissions, 35%;

Overall Acceptance Rate 254 of 830 submissions, 31%

Bibliometrics

Total Citations: 9
Total Downloads: 406
Downloads (Last 12 months): 5
Downloads (Last 6 weeks): 0

Reflects downloads up to 16 Nov 2024

Citations

Cited By

Shahzadi I, Fuzail M, Aslam D. (2023). Deep Emotions Recognition from Facial Expressions using Deep Learning. VFAST Transactions on Software Engineering 11:2 (58-69). Online publication date: 19-Jun-2023.
https://doi.org/10.21015/vtse.v11i2.1501
Yucel M, Cinbis R, Duygulu P. (2022). How robust are discriminatively trained zero-shot learning models? Image and Vision Computing 119:C. Online publication date: 1-Mar-2022.
https://dl.acm.org/doi/10.1016/j.imavis.2022.104392
Yucel M, Cinbis R, Duygulu P. (2020). A Deep Dive into Adversarial Robustness in Zero-Shot Learning. Computer Vision – ECCV 2020 Workshops (3-21). Online publication date: 23-Aug-2020.
https://dl.acm.org/doi/10.1007/978-3-030-66415-2_1
Wang W, Zheng V, Yu H, Miao C. (2019). A Survey of Zero-Shot Learning. ACM Transactions on Intelligent Systems and Technology 10:2 (1-37). Online publication date: 16-Jan-2019.
https://dl.acm.org/doi/10.1145/3293318
Song J, Shi G, Xie X, Gao D. (2019). Zero-Shot Learning Using Stacked Autoencoder with Manifold Regularizations. 2019 IEEE International Conference on Image Processing (ICIP) (3651-3655). Online publication date: Sep-2019.
https://doi.org/10.1109/ICIP.2019.8803509
Rahman S, Khan S, Barnes N. (2019). Transductive Learning for Zero-Shot Object Detection. 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (6081-6090). Online publication date: Oct-2019.
https://doi.org/10.1109/ICCV.2019.00618
Liu K, Liu W, Ma H, Huang W, Dong X. (2019). Generalized zero-shot learning for action recognition with web-scale video data. World Wide Web 22:2 (807-824). Online publication date: 1-Mar-2019.
https://dl.acm.org/doi/10.1007/s11280-018-0642-6
Long T, Xu X, Li Y, Shen F, Song J, Shen H, Boll S, Mu Lee K, Luo J, Zhu W, Byun H, Wen Chen C, Lienhart R, Mei T. (2018). Pseudo Transfer with Marginalized Corrupted Attribute for Zero-shot Learning. Proceedings of the 26th ACM international conference on Multimedia (1802-1810). Online publication date: 15-Oct-2018.
https://dl.acm.org/doi/10.1145/3240508.3240715
Li X, Cui G, Dong Y. (2018). Discriminative and Orthogonal Subspace Constraints-Based Nonnegative Matrix Factorization. ACM Transactions on Intelligent Systems and Technology 9:6 (1-24). Online publication date: 1-Nov-2018.
https://dl.acm.org/doi/10.1145/3229051
