
HAIGEN: Towards Human-AI Collaboration for Facilitating Creativity and Style Generation in Fashion Design

Published: 09 September 2024

Abstract

Fashion design typically involves sketching, refining, and coloring, with designers drawing inspiration from a wide range of images. However, conventional image search often returns irrelevant results, which slows the design process. Creating and coloring sketches is also time-consuming and demanding, making it a bottleneck in the design workflow. In this work, we introduce HAIGEN (Human-AI Collaboration for GENeration), an efficient fashion design system for Human-AI collaboration developed to aid designers. HAIGEN consists of four modules. T2IM, deployed in the cloud, generates reference inspiration images directly from text prompts. The other three modules run locally: I2SM batch-converts the image material library into a sketch material library in a specific designer's style; SRM recommends similar sketches from the generated library for the designer to refine further; and STM colors the refined sketch according to the styles of the inspiration images. With our system, any designer can perform personalized fine-tuning locally while leveraging the generation capabilities of large models in the cloud, streamlining the entire design development process. Because our approach combines cloud and local model deployment, designers do not need to upload their personalized data, which safeguards design privacy. We validated the effectiveness of each module through extensive qualitative and quantitative experiments. User surveys also confirmed that HAIGEN offers significant advantages in design efficiency, positioning it as a next-generation assistive tool for designers.
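
The abstract does not say how SRM measures similarity between sketches. As a rough illustration of the recommendation step only, the sketch below ranks a small local library by cosine similarity to a designer's draft. The feature vectors, the `Sketch` type, and the `recommend` function are hypothetical and are not taken from the paper; a real system would presumably obtain features from a learned sketch encoder.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Tuple


@dataclass(frozen=True)
class Sketch:
    """A sketch with a precomputed feature vector (made-up values here)."""
    name: str
    features: Tuple[float, ...]


def cosine_similarity(a: Tuple[float, ...], b: Tuple[float, ...]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query: Sketch, library: List[Sketch], top_k: int = 2) -> List[Sketch]:
    """Return the top_k library sketches most similar to the query sketch."""
    ranked = sorted(
        library,
        key=lambda s: cosine_similarity(query.features, s.features),
        reverse=True,
    )
    return ranked[:top_k]


# Assumed toy data: three library sketches and one designer draft.
library = [
    Sketch("a_line_dress", (0.9, 0.1, 0.0)),
    Sketch("trench_coat", (0.1, 0.9, 0.2)),
    Sketch("slip_dress", (0.8, 0.2, 0.1)),
]
query = Sketch("designer_draft", (1.0, 0.0, 0.0))

print([s.name for s in recommend(query, library)])
```

Because the library and the query both stay on the designer's machine in this sketch, the ranking needs no network call, which is consistent with the local deployment the abstract describes for SRM.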

Supplemental Material

MP4 File - Hunan University
Demo Video

Published In

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 8, Issue 3
August 2024
1782 pages
EISSN: 2474-9567
DOI: 10.1145/3695755

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. Generative Artificial Intelligence
2. Human-AI Collaboration
3. Personalized Fashion Design

Qualifiers

• Research-article
• Research
• Refereed
