Computer Science > Machine Learning

arXiv:2204.09268 (cs)

[Submitted on 20 Apr 2022]

Title: Uncertainty-based Cross-Modal Retrieval with Probabilistic Representations

Authors: Leila Pishdad, Ran Zhang, Konstantinos G. Derpanis, Allan Jepson, Afsaneh Fazly

Abstract: Probabilistic embeddings have proven useful for capturing polysemous word meanings, as well as ambiguity in image matching. In this paper, we study the advantages of probabilistic embeddings in a cross-modal setting (i.e., text and images), and propose a simple approach that replaces the standard vector point embeddings in extant image-text matching models with probabilistic distributions that are parametrically learned. Our guiding hypothesis is that the uncertainty encoded in the probabilistic embeddings captures the cross-modal ambiguity in the input instances, and that it is through capturing this uncertainty that the probabilistic models can perform better at downstream tasks, such as image-to-text or text-to-image retrieval. Through extensive experiments on standard and new benchmarks, we show a consistent advantage for probabilistic representations in cross-modal retrieval, and validate the ability of our embeddings to capture uncertainty.

Comments:	13 pages, 7 figures
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)
Cite as:	arXiv:2204.09268 [cs.LG]
	(or arXiv:2204.09268v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2204.09268

Submission history

From: Ran Zhang
[v1] Wed, 20 Apr 2022 07:24:20 UTC (20,633 KB)
