Deep Heterogeneous Multi-Task Metric Learning for Visual Recognition and Retrieval

Authors:

Han Hu

MM '20: Proceedings of the 28th ACM International Conference on Multimedia

Pages 1837 - 1845

https://doi.org/10.1145/3394171.3413574

Published: 12 October 2020

Abstract

Estimating the distance between data instances is a fundamental problem in many artificial intelligence algorithms and is critical in diverse multimedia applications. A major challenge is finding an appropriate distance function when labeled data are insufficient for a given task. Multi-task metric learning (MTML) alleviates this data deficiency by learning distance metrics for multiple tasks together and sharing information between the tasks. Recently, heterogeneous MTML (HMTML) has attracted much attention because it can handle multiple tasks with varied data representations. A major drawback of current HMTML approaches is that they learn only linear transformations to connect different domains. This is suboptimal, since the correlations between domains may be very complex and highly nonlinear. To overcome this drawback, we propose a deep heterogeneous MTML (DHMTML) method, in which a nonlinear mapping is learned for each task using a deep neural network. The correlations between domains are exploited by sharing some parameters at the top layers of the different networks. More importantly, an auto-encoder scheme and an adversarial learning mechanism are incorporated to help exploit the feature correlations within and between tasks, and task-specific properties are preserved by learning additional task-specific layers together with the common layers. Experiments demonstrate that the proposed method consistently outperforms single-task deep metric learning algorithms and other HMTML approaches on several benchmark datasets.
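
The parameter-sharing idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the layer sizes, the tanh nonlinearity, the single task-specific layer per network, and the names `TaskNetwork`, `make_layer`, and `shared_top` are all assumptions made for illustration. The networks are untrained, and the auto-encoder and adversarial components are omitted. The sketch only shows two tasks with different input dimensionality being mapped into one common subspace through a top layer they both share, with the distance measured in that subspace.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return a dense layer as (weights, biases) with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(layer, x):
    """Apply one dense layer followed by a tanh nonlinearity."""
    weights, biases = layer
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


class TaskNetwork:
    """One network per task: a task-specific bottom layer plus a shared top layer."""

    def __init__(self, input_dim, hidden_dim, shared_top, rng):
        self.specific = make_layer(input_dim, hidden_dim, rng)  # private to this task
        self.shared_top = shared_top  # the same object in every task network

    def embed(self, x):
        """Map a raw input into the common subspace."""
        return forward(self.shared_top, forward(self.specific, x))

    def distance(self, x1, x2):
        """Nonlinear metric: Euclidean distance between the two embeddings."""
        return math.dist(self.embed(x1), self.embed(x2))


rng = random.Random(0)
HIDDEN_DIM, COMMON_DIM = 4, 3  # hypothetical sizes, not taken from the paper
shared_top = make_layer(HIDDEN_DIM, COMMON_DIM, rng)

# Two heterogeneous tasks whose inputs have different dimensionality.
task_a = TaskNetwork(input_dim=5, hidden_dim=HIDDEN_DIM, shared_top=shared_top, rng=rng)
task_b = TaskNetwork(input_dim=8, hidden_dim=HIDDEN_DIM, shared_top=shared_top, rng=rng)

xa1 = [rng.random() for _ in range(5)]
xa2 = [rng.random() for _ in range(5)]
xb1 = [rng.random() for _ in range(8)]

# Both embeddings live in the same COMMON_DIM-dimensional subspace.
print(len(task_a.embed(xa1)), len(task_b.embed(xb1)))
print(task_a.distance(xa1, xa2))
```

Because `shared_top` is a single object referenced by both networks, any update to it during training would affect every task, which is one simple way to let tasks exchange information while the bottom layers stay task-specific.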

Supplementary Material

MP4 File (3394171.3413574.mp4)

In this video, we present a novel deep heterogeneous multi-task metric learning framework that learns multiple nonlinear distance metrics simultaneously and enables effective information transfer between the different tasks/domains. Specifically, a nonlinear metric is learned for each task using a neural network, and the different networks are required to share some top layers to enable information transfer. Task-specific representations are learned together with the common representation to respect the specific properties of each task. We also introduce an auto-encoder scheme to exploit structures, such as feature correlations, contained within and between different domains. Another major contribution is the use of adversarial learning so that different domains not only share the same features but also follow the same data distribution in the common subspace. We demonstrate the effectiveness of our method in toy face recognition and in natural image clustering and retrieval.
