DOI: 10.1145/3581783.3612139
research-article

Multimodal Color Recommendation in Vector Graphic Documents

Published: 27 October 2023

Abstract

Color selection plays a critical role in graphic document design and requires careful consideration of various contexts. However, recommending appropriate colors that harmonize with the other colors and the textual context of a document is a challenging task, even for experienced designers. In this study, we propose a multimodal masked color model that integrates both color and textual contexts to provide text-aware color recommendation for graphic documents. Our proposed model comprises self-attention networks that capture the relationships between colors in multiple palettes, and cross-attention networks that incorporate both color and CLIP-based text representations. Our method primarily focuses on color palette completion, which recommends colors based on the given colors and text. It is also applicable to another color recommendation task, full palette generation, which generates a complete color palette corresponding to the given text. Experimental results demonstrate that our approach surpasses previous color palette completion methods in accuracy, color distribution, and user experience, and surpasses full palette generation methods in color diversity and similarity to the ground truth palettes.
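
To make the palette completion task concrete, here is a minimal sketch of how a masked input could be assembled from several palettes and a text prompt. It is not the authors' implementation. The function names, the `MASK` placeholder, the palette names `"image"` and `"text"`, and the dark-to-light ordering are all assumptions introduced for illustration; the ordering step is borrowed from the lightness-ordered color representation mentioned in the presentation summary. The attention networks and CLIP encoder are deliberately left out.

```python
# Illustrative sketch only. All names here are hypothetical and the
# masking scheme is an assumption, not the paper's actual pipeline.

MASK = None  # hypothetical placeholder for a color to be recommended


def srgb_to_lightness(rgb):
    """Return CIELAB L* (0..100) for an sRGB color with 0..255 channels."""
    def linearize(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b  # relative luminance (D65)
    delta = 6 / 29
    f = y ** (1 / 3) if y > delta ** 3 else y / (3 * delta ** 2) + 4 / 29
    return 116 * f - 16


def order_by_lightness(palette):
    """Sort colors from dark to light (the direction is an assumption)."""
    return sorted(palette, key=srgb_to_lightness)


def build_completion_query(palettes, text, masked_positions):
    """Order each palette by lightness, then hide the requested positions.

    palettes: dict mapping a palette name to a list of RGB tuples.
    masked_positions: set of (palette name, index after ordering) pairs.
    Returns the masked palettes, the text, and the hidden target colors.
    """
    query = {"text": text, "palettes": {}, "targets": {}}
    for name, colors in palettes.items():
        slots = []
        for i, color in enumerate(order_by_lightness(colors)):
            if (name, i) in masked_positions:
                query["targets"][(name, i)] = color  # ground truth to predict
                slots.append(MASK)
            else:
                slots.append(color)
        query["palettes"][name] = slots
    return query


# Hypothetical document with two palettes and a short text prompt.
query = build_completion_query(
    {"image": [(250, 250, 250), (20, 40, 90)], "text": [(200, 30, 30)]},
    text="summer sale",
    masked_positions={("image", 0)},
)
```

In this sketch, a model would receive `query["palettes"]` and `query["text"]` and be asked to fill in each `MASK` slot. Masking every slot would correspond to full palette generation from text alone.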

Supplemental Material

MP4 File
This is the ACMMM23 presentation video of paper 1940, titled "Multimodal color recommendation in vector graphic documents", presented by Qianru Qiu from CyberAgent. This work proposes a multimodal masked color model that integrates color and textual contexts within a graphic document using a lightness-ordered color representation and a CLIP-based text representation. The model is applicable to two color recommendation tasks: color palette completion and full palette generation. The video covers three parts: introduction, methodology, and experiments. For more details, please scan the QR code in the video to visit our project page.


Published In

MM '23: Proceedings of the 31st ACM International Conference on Multimedia
October 2023
9913 pages
ISBN: 9798400701085
DOI: 10.1145/3581783
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. attention network
  2. color recommendation
  3. multimodal learning
  4. palette generation

Qualifiers

  • Research-article

Conference

MM '23: The 31st ACM International Conference on Multimedia
October 29 - November 3, 2023
Ottawa ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
