Interpretable Partitioned Embedding for Intelligent Multi-item Fashion Outfit Composition

Published: 29 July 2019

Abstract

Intelligent fashion outfit composition has grown in popularity in recent years, and some deep-learning-based approaches produce competitive compositions. However, because such approaches are not interpretable, they fail to meet the need of businesses, designers, and consumers to understand the importance of different attributes in an outfit composition. To realize interpretable and intelligent multi-item fashion outfit composition, we propose a partitioned embedding network that learns interpretable embeddings from clothing items. The network contains two vital components: an attribute partition module and a partition adversarial module. In the attribute partition module, multiple attribute labels are adopted to ensure that different parts of the overall embedding correspond to different attributes. In the partition adversarial module, adversarial operations are adopted to make the different parts independent of one another. With the interpretable and partitioned embedding, we then construct an outfit-composition graph and an attribute matching map. Extensive experiments demonstrate that (1) the partitioned embedding has unmingled parts that correspond to different attributes and (2) outfits recommended by our model are more desirable than those of existing methods.
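
To make the idea of a partitioned embedding concrete, the following is a minimal sketch and not the authors' implementation. It assumes an item embedding is a flat vector whose contiguous slices are assigned to attributes, and it compares two items one attribute at a time. The attribute names, slice sizes, example vectors, and the use of cosine similarity are all illustrative assumptions; the abstract does not specify them.

```python
import math

# Hypothetical layout: which slice of the embedding belongs to which attribute.
# The attribute names and slice sizes are illustrative assumptions, not from the paper.
PARTITIONS = {"color": (0, 3), "shape": (3, 5), "texture": (5, 8)}


def split_embedding(embedding):
    """Return a dict mapping each attribute name to its slice of the embedding."""
    return {name: embedding[start:end] for name, (start, end) in PARTITIONS.items()}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def attribute_similarity(item_a, item_b):
    """Compare two item embeddings attribute by attribute."""
    parts_a = split_embedding(item_a)
    parts_b = split_embedding(item_b)
    return {name: cosine(parts_a[name], parts_b[name]) for name in PARTITIONS}


# Two made-up item embeddings that agree on color and texture but differ in shape.
shirt = [1.0, 0.0, 0.0, 0.5, 0.5, 0.2, 0.1, 0.9]
skirt = [1.0, 0.0, 0.0, -0.5, 0.5, 0.2, 0.1, 0.9]
scores = attribute_similarity(shirt, skirt)
```

Because each score is computed from one slice only, a per-attribute breakdown like `scores` can say which attribute drives a match, which is the kind of interpretability the abstract argues for. Whether the slices really are independent depends on training (the role of the partition adversarial module), which this sketch does not model.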

Information & Contributors

Information

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 15, Issue 2s
Special Section on Cross-Media Analysis for Visual Question Answering, Special Section on Big Data, Machine Learning and AI Technologies for Art and Design and Special Section on MMSys/NOSSDAV 2018
April 2019
381 pages
ISSN:1551-6857
EISSN:1551-6865
DOI:10.1145/3343360
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 July 2019
Accepted: 01 April 2019
Revised: 01 February 2019
Received: 01 August 2018
Published in TOMM Volume 15, Issue 2s

Author Tags

  1. Outfit composition
  2. adversarial
  3. embedding
  4. interpretable

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Program of International Science and Technology Cooperation
  • Key Research and Development Program of Zhejiang Province
  • National Key Research and Development Program
  • National Natural Science Foundation of China

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 19
  • Downloads (last 6 weeks): 1
Reflects downloads up to 29 Jan 2025.

Citations

Cited By

  • (2024) Toward Attribute-Controlled Fashion Image Captioning. ACM Transactions on Multimedia Computing, Communications, and Applications 20:9 (1-18). DOI: 10.1145/3671000. Online publication date: 5-Jun-2024
  • (2024) Arbitrary Virtual Try-on Network: Characteristics Preservation and Tradeoff between Body and Clothing. ACM Transactions on Multimedia Computing, Communications, and Applications 20:5 (1-23). DOI: 10.1145/3636426. Online publication date: 11-Jan-2024
  • (2024) Lost Your Style? Navigating with Semantic-Level Approach for Text-to-Outfit Retrieval. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (8051-8060). DOI: 10.1109/WACV57701.2024.00788. Online publication date: 3-Jan-2024
  • (2023) Computational Technologies for Fashion Recommendation: A Survey. ACM Computing Surveys 56:5 (1-45). DOI: 10.1145/3627100. Online publication date: 25-Nov-2023
  • (2023) Explaining Cross-domain Recognition with Interpretable Deep Classifier. ACM Transactions on Multimedia Computing, Communications, and Applications 20:3 (1-21). DOI: 10.1145/3623399. Online publication date: 23-Oct-2023
  • (2023) Self-Adaptive Clothing Mapping Based Virtual Try-on. ACM Transactions on Multimedia Computing, Communications, and Applications 20:3 (1-26). DOI: 10.1145/3613453. Online publication date: 23-Oct-2023
  • (2023) COutfitGAN: Learning to Synthesize Compatible Outfits Supervised by Silhouette Masks and Fashion Styles. IEEE Transactions on Multimedia 25 (4986-5001). DOI: 10.1109/TMM.2022.3185894. Online publication date: 1-Jan-2023
  • (2023) Fashion-Specific Ambiguous Expression Interpretation with Partial Visual-Semantic Embedding. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (3497-3502). DOI: 10.1109/CVPRW59228.2023.00353. Online publication date: Jun-2023
  • (2023) Partial visual-semantic embedding. Knowledge-Based Systems 277:C. DOI: 10.1016/j.knosys.2023.110791. Online publication date: 9-Oct-2023
  • (2022) Neural stylist. Expert Systems with Applications: An International Journal 203:C. DOI: 10.1016/j.eswa.2022.117333. Online publication date: 1-Oct-2022
