Computer Science > Computer Vision and Pattern Recognition

arXiv:2303.05692 (cs)

[Submitted on 10 Mar 2023]

Title: Semantic-Preserving Augmentation for Robust Image-Text Retrieval

Authors: Sunwoo Kim, Kyuhong Shim, Luong Trung Nguyen, Byonghyo Shim

Abstract: Image-text retrieval is the task of searching for the proper textual descriptions of the visual world, and vice versa. One challenge of this task is its vulnerability to corruptions of the input image and text. Such corruptions are often unobserved during training and substantially degrade the decision quality of the retrieval model. In this paper, we propose a novel image-text retrieval technique, referred to as robust visual semantic embedding (RVSE), which consists of novel image-based and text-based augmentation techniques called semantic-preserving augmentation for image (SPAugI) and text (SPAugT). Because SPAugI and SPAugT change the original data in a way that preserves its semantic information, they force the feature extractors to generate semantic-aware embedding vectors regardless of the corruption, which significantly improves model robustness. Through extensive experiments on benchmark datasets, we show that RVSE outperforms conventional retrieval schemes in terms of image-text retrieval performance.

Comments:	Accepted to ICASSP 2023
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2303.05692 [cs.CV]
	(or arXiv:2303.05692v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2303.05692

Submission history

From: Sunwoo Kim
[v1] Fri, 10 Mar 2023 03:50:44 UTC (2,877 KB)
