
Transform, Warp, and Dress: A New Transformation-guided Model for Virtual Try-on

Published: 16 February 2022

Abstract

Virtual try-on has recently emerged in the computer vision and multimedia communities with the development of architectures that can generate realistic images of a target person wearing a custom garment. This research interest is motivated by the large role played by e-commerce and online shopping in our society. Indeed, the virtual try-on task offers many opportunities to improve the efficiency of preparing fashion catalogs and to enhance the online user experience. The problem is far from being solved: current architectures do not reach sufficient accuracy with respect to manually generated images and can only be trained on image pairs with limited variety. Existing virtual try-on datasets have two main limitations: they contain only female models, and all images are available only in low resolution. This not only affects the generalization capabilities of the trained architectures but also makes deployment in real-world applications impractical. To overcome these issues, we present Dress Code, a new dataset for virtual try-on that contains high-resolution images of a large variety of upper-body clothes and both male and female models. Leveraging this enriched dataset, we propose a new model for virtual try-on capable of generating high-quality and photo-realistic images using a three-stage pipeline. The first two stages perform two different geometric transformations to warp the desired garment and make it fit the target person’s pose and body shape. Then, we generate the new image of that same person wearing the try-on garment using a generative network. We test the proposed solution on the most widely used dataset for this task as well as on our newly collected dataset and demonstrate its effectiveness when compared to current state-of-the-art methods. Through extensive analyses on our Dress Code dataset, we show the adaptability of our model, which can generate try-on images even at higher resolutions.
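The garment-warping stages described in the abstract rely on geometric transformations estimated from matched control points; thin-plate splines (TPS) are the standard choice in this line of work. The sketch below is a minimal, self-contained NumPy illustration of fitting and applying a 2D TPS mapping between control points. The function names and the toy control points are invented for the example; in the actual model, the transformation parameters are regressed by a neural network rather than solved in closed form.

```python
import numpy as np

def tps_fit(src, dst):
    """Fit a 2D thin-plate spline that maps src control points onto dst.

    Solves the standard TPS linear system with kernel U(r) = r^2 log(r^2).
    Returns an (n+3, 2) parameter matrix: n kernel weights + affine part.
    """
    n = src.shape[0]
    d2 = np.sum((src[:, None] - src[None, :]) ** 2, axis=-1)
    K = np.where(d2 > 0, d2 * np.log(np.maximum(d2, 1e-12)), 0.0)
    P = np.hstack([np.ones((n, 1)), src])          # affine terms [1, x, y]
    L = np.zeros((n + 3, n + 3))
    L[:n, :n] = K
    L[:n, n:] = P
    L[n:, :n] = P.T
    rhs = np.zeros((n + 3, 2))
    rhs[:n] = dst
    return np.linalg.solve(L, rhs)

def tps_apply(params, src, pts):
    """Warp arbitrary 2D points with a fitted TPS."""
    n = src.shape[0]
    d2 = np.sum((pts[:, None] - src[None, :]) ** 2, axis=-1)
    U = np.where(d2 > 0, d2 * np.log(np.maximum(d2, 1e-12)), 0.0)
    P = np.hstack([np.ones((len(pts), 1)), pts])
    return U @ params[:n] + P @ params[n:]

# Toy control points: corners plus center, nudged to simulate a garment fit.
src = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]], dtype=float)
dst = src + np.array([[0.1, 0], [0, 0.1], [-0.1, 0], [0, -0.1], [0.05, 0.05]])
params = tps_fit(src, dst)
# A TPS interpolates its control points exactly.
print(np.allclose(tps_apply(params, src, src), dst, atol=1e-6))
```

In a try-on pipeline, the same warp would be applied to a dense pixel grid of the in-shop garment image so that it aligns with the keypoints of the target body.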




Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 2
May 2022
494 pages
ISSN:1551-6857
EISSN:1551-6865
DOI:10.1145/3505207

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 16 February 2022
Accepted: 01 August 2021
Revised: 01 July 2021
Received: 01 March 2021
Published in TOMM Volume 18, Issue 2


Author Tags

  1. Virtual try-on
  2. geometric transformations
  3. generative adversarial networks

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • “SUPER—Supercomputing Unified Platform”

Article Metrics

  • Downloads (Last 12 months)205
  • Downloads (Last 6 weeks)14
Reflects downloads up to 20 Nov 2024

Cited By

  • (2024) Revolutionizing E-Commerce. In Creating AI Synergy Through Business Technology Transformation, 115–136. DOI: 10.4018/979-8-3693-4187-2.ch006. Online publication date: 30-Aug-2024.
  • (2024) KF-VTON: Keypoints-Driven Flow Based Virtual Try-On Network. ACM Transactions on Multimedia Computing, Communications, and Applications 20(9), 1–23. DOI: 10.1145/3673903. Online publication date: 19-Jun-2024.
  • (2024) Appearance and Pose-guided Human Generation: A Survey. ACM Computing Surveys 56(5), 1–35. DOI: 10.1145/3637060. Online publication date: 12-Jan-2024.
  • (2024) Limb-Aware Virtual Try-On Network With Progressive Clothing Warping. IEEE Transactions on Multimedia 26, 1731–1746. DOI: 10.1109/TMM.2023.3286278. Online publication date: 1-Jan-2024.
  • (2023) Artificial Intelligence in Business-to-Customer Fashion Retail: A Literature Review. Mathematics 11(13), 2943. DOI: 10.3390/math11132943. Online publication date: 30-Jun-2023.
  • (2023) Cloth Interactive Transformer for Virtual Try-On. ACM Transactions on Multimedia Computing, Communications, and Applications 20(4), 1–20. DOI: 10.1145/3617374. Online publication date: 11-Dec-2023.
  • (2023) Self-Adaptive Clothing Mapping Based Virtual Try-on. ACM Transactions on Multimedia Computing, Communications, and Applications 20(3), 1–26. DOI: 10.1145/3613453. Online publication date: 23-Oct-2023.
  • (2023) A Multi-Level Consistency Network for High-Fidelity Virtual Try-On. ACM Transactions on Multimedia Computing, Communications, and Applications 19(5), 1–18. DOI: 10.1145/3580500. Online publication date: 16-Mar-2023.
  • (2023) High-Fidelity Face Reenactment Via Identity-Matched Correspondence Learning. ACM Transactions on Multimedia Computing, Communications, and Applications 19(3), 1–23. DOI: 10.1145/3571857. Online publication date: 25-Feb-2023.
  • (2023) Virtual Footwear Try-On in Augmented Reality Using Deep Learning Models. Journal of Computing and Information Science in Engineering 24(3). DOI: 10.1115/1.4062596. Online publication date: 9-Oct-2023.
