Computer Science > Computer Vision and Pattern Recognition

arXiv:2203.16015 (cs)

[Submitted on 30 Mar 2022]

Title:ITTR: Unpaired Image-to-Image Translation with Transformers

Authors:Wanfeng Zheng, Qiang Li, Guoxin Zhang, Pengfei Wan, Zhongyuan Wang


Abstract:Unpaired image-to-image translation translates an image from a source domain to a target domain without paired training data. Various techniques that use CNNs to extract local semantics have been developed to improve translation performance. However, CNN-based generators lack the ability to capture long-range dependencies and therefore cannot fully exploit global semantics. Recently, Vision Transformers have been widely investigated for recognition tasks. Though appealing, simply transferring a recognition-based vision transformer to image-to-image translation is inappropriate because of the difficulty of generation and the limits on computation. In this paper, we propose an effective and efficient architecture for unpaired Image-to-Image Translation with Transformers (ITTR). It has two main designs: 1) a hybrid perception block (HPB) that mixes tokens from different receptive fields to utilize global semantics; 2) dual pruned self-attention (DPSA), which sharply reduces computational complexity. ITTR outperforms the state of the art for unpaired image-to-image translation on six benchmark datasets.

Comments:	18 pages, 7 figures, 5 tables
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2203.16015 [cs.CV]
	(or arXiv:2203.16015v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2203.16015

Submission history

From: Qiang Li Capasso
[v1] Wed, 30 Mar 2022 02:46:12 UTC (3,758 KB)
