HAIGEN: Towards Human-AI Collaboration for Facilitating Creativity and Style Generation in Fashion Design

Authors:

Tangquan Qi

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 8, Issue 3

Article No.: 107, Pages 1 - 27

https://doi.org/10.1145/3678518

Published: 09 September 2024

Abstract

The process of fashion design usually involves sketching, refining, and coloring, with designers drawing inspiration from various images to fuel their creative endeavors. However, conventional image search methods often yield irrelevant results, impeding the design process. Moreover, creating and coloring sketches can be time-consuming and demanding, acting as a bottleneck in the design workflow. In this work, we introduce HAIGEN (Human-AI Collaboration for GENeration), an efficient fashion design system for Human-AI collaboration developed to aid designers. Specifically, HAIGEN consists of four modules. T2IM, located in the cloud, generates reference inspiration images directly from text prompts. The other three modules run locally: I2SM batch-converts the image material library into a sketch material library in a particular designer's style; SRM recommends similar sketches from the generated library to the designer for further refinement; and STM colors the refined sketch according to the styles of the inspiration images. Through our system, any designer can perform local personalized fine-tuning while leveraging the powerful generation capabilities of large models in the cloud, streamlining the entire design development process. Because our approach combines cloud and local model deployment, it safeguards design privacy by avoiding the need to upload designers' personalized local data. We validated the effectiveness of each module through extensive qualitative and quantitative experiments. User surveys also confirmed that HAIGEN offers significant advantages in design efficiency, positioning it as a new generation of aid tool for designers.
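
The data flow between the four modules described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the class `DesignSession`, its method `run`, the `cloud_log` field, and the string-based stand-ins for images and sketches are all hypothetical. It also assumes that SRM matches against a draft sketch supplied by the designer and that the image material library is provided separately from the T2IM output, neither of which the abstract states explicitly.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-ins: real systems would pass image tensors or files.
Image = str
Sketch = str


@dataclass
class DesignSession:
    """Illustrative wiring of the four modules (names are assumptions)."""
    t2im: Callable[[str], List[Image]]                   # cloud: text prompt -> inspiration images
    i2sm: Callable[[List[Image]], List[Sketch]]          # local: image library -> sketch library
    srm: Callable[[Sketch, List[Sketch]], List[Sketch]]  # local: rank library sketches by similarity
    stm: Callable[[Sketch, Image], Image]                # local: color a sketch in an image's style
    cloud_log: List[str] = field(default_factory=list)   # records everything sent to the cloud

    def run(self, prompt: str, image_library: List[Image], draft: Sketch,
            refine: Callable[[Sketch], Sketch]) -> Image:
        # Only the text prompt crosses the cloud boundary.
        self.cloud_log.append(prompt)
        inspirations = self.t2im(prompt)
        # Everything below stays on the designer's machine.
        sketch_library = self.i2sm(image_library)
        candidates = self.srm(draft, sketch_library)
        refined = refine(candidates[0])  # the designer edits the top recommendation
        return self.stm(refined, inspirations[0])


# Toy stubs so the sketch runs end to end; real modules would be trained models.
session = DesignSession(
    t2im=lambda prompt: [f"image({prompt})"],
    i2sm=lambda images: [f"sketch({im})" for im in images],
    srm=lambda query, library: sorted(library, key=lambda s: s != query),
    stm=lambda sketch, style: f"colored({sketch}, style={style})",
)
result = session.run(
    "red silk dress",
    ["photo1", "photo2"],
    "sketch(photo2)",
    refine=lambda s: s + "+edits",
)
print(result)
print(session.cloud_log)
```

Keeping the cloud call behind a single logged entry point is one way to make the privacy claim checkable: in this sketch, the designer's image library, draft, and refined sketch never appear in `cloud_log`.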

Supplemental Material

MP4 File - Hunan University

Demo Video


Cited By

Wu D, Wu M, Li Y, Jiang J, Li X, Deng H, Liu C, Li Y (2024). StyleWe: Towards Style Fusion in Generative Fashion Design with Efficient Federated AI. Proceedings of the ACM on Human-Computer Interaction 8:CSCW2 (1-31). DOI: 10.1145/3687054. Online publication date: 8-Nov-2024
https://dl.acm.org/doi/10.1145/3687054

Index Terms

HAIGEN: Towards Human-AI Collaboration for Facilitating Creativity and Style Generation in Fashion Design
1. Applied computing
  1. Arts and humanities
2. Computing methodologies
  1. Artificial intelligence


Published In

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies Volume 8, Issue 3

August 2024

1782 pages

EISSN:2474-9567

DOI:10.1145/3695755

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article
Research
Refereed

Funding Sources

National Natural Science Foundation of China
Key R&D Program of Hunan Province
