Graphic designers often get inspiration through the recombination of references. Our formative study (N=6) reveals that graphic designers focus on conceptual keywords during this process and want support for discovering these keywords, expanding them, and exploring diverse options for recombining them, while still leaving room for their own creativity. We propose CreativeConnect, a system with generative AI pipelines that helps users discover useful elements from reference images as keywords, recommends relevant keywords, generates diverse recombination options from user-selected keywords, and shows the recombinations as sketches with text descriptions. Our user study (N=16) showed that CreativeConnect helped users discover keywords from references and generate multiple ideas based on them, ultimately helping users produce more design ideas with higher self-reported creativity compared to a baseline system without the generative pipelines. Beyond its effectiveness in ideation, we discuss how CreativeConnect can be extended to support other types of creativity support tasks.
Figure 1:
1 Introduction
References play a crucial role in creative work such as graphic design, serving as valuable sources both to grasp the landscape of existing ideas and to ignite novel ones [31, 52, 62, 73]. They offer diverse visual, conceptual, and functional stimuli, allowing individuals to explore various creative directions and draw lessons from established successful examples [3]. One effective method to generate new ideas with references is to combine existing examples, which is often called combinatorial creativity [7, 9, 79, 87]. In practice, this is often done through reference recombination, the process of extracting elements or aspects from multiple references, considering connections between them [31], and blending them to gain novel design ideas [2].
However, each step of recombination requires significant effort from designers. To discover sources for recombination, designers need to dissect the references into individual elements and analyze them to determine which combinations of elements are worth mixing. Additionally, they must engage in exploratory efforts by drawing multiple sketches to find effective methods of blending those elements into a new design idea. This takes a long time and multiple iterations, especially for those less experienced in the design process, as they have difficulty identifying various factors from references and integrating references from disparate domains compared to professionals [4].
Previous research has provided support for these individual steps. Several approaches have been proposed [35, 36, 39] to decompose references or show connections between them, aiding users in identifying sources for recombination. However, these approaches do not guide how to incorporate the extracted elements into a design. Also, many approaches have attempted to help users blend different concepts or images into a novel one [12, 13, 84, 91]. However, these approaches primarily emphasize generating precise combinations that incorporate all elements harmoniously rather than producing diverse combinations for creative exploration. Another thread of research focuses on searching by genetic recombination [14, 45, 82, 88]. Still, these techniques focus on widening the range of design exploration rather than offering inspiration on how to effectively combine specific design elements.
Through a formative study with six early-stage graphic designers and design students, we aimed to understand the process of reference recombination and identify its challenges. We found two distinct stages of ideation: 1) conceptual ideation, which aims to convey the design topic effectively, and 2) visual ideation, which is about deciding style-wise details on top of the selected concept. We decided to focus on the conceptual stage, as the recombination of references tends to be more prevalent there. During conceptual ideation, designers extracted four types of elements from the reference—subject matter, action & pose, theme & mood, and compositional aspects (arrangement). Then, they tried to brainstorm more elements related to the extracted ones and combine them in several ways. However, due to the high effort required to recombine them manually, they were concerned that they could not try out all possible combinations. They also mentioned that support for ideation should not come in an overly complete form, as it can diminish their own input. With these observations, we propose four design goals for a reference recombination support system: (1) enable users to effortlessly specify the four types of conceptual elements from the reference image, (2) recommend relevant elements, (3) provide as many recombination options as possible, and (4) intentionally keep the generated output partially unfinished to foster users’ creativity.
Based on the design goals, we propose CreativeConnect, a system that supports the design ideation process by helping users easily extract elements from the reference images and generate a wide range of recombinations of those elements. Using CreativeConnect, users can easily discover and select elements from the reference image based on the four element types and get recommendations for more relevant keywords. Once the user has chosen the keywords to combine, they get various recombination options presented as pairs of sketch images and one-line descriptions. We introduced novel pipelines with generative models to automate the extraction of keywords from images, generate recombination options, and transform them into descriptions and sketches.
We conducted a within-subjects study with 16 design students to compare CreativeConnect with a baseline system consisting of a mood board with manual keyword notes, a layout diffusion model, and ChatGPT. Results showed that CreativeConnect could support both stages of the reference recombination process: discovering elements from references and generating design ideas by recombining them. Participants also produced more design ideas in a given time and perceived that CreativeConnect helped them develop more creative sketches than the baseline. They emphasized that CreativeConnect was especially beneficial for getting inspirational ideas vastly different from their initial concepts. We compared the creativity support of CreativeConnect with the baseline and identified an opportunity to design a comprehensive recombination support tool that could cover a broad spectrum of design needs and situations. We also found that the low fidelity of the sketch-based output led users to imagine more and gain more stimulus for their creativity. Finally, we discuss the generalizability of CreativeConnect in terms of user expertise, collaborative settings, and different domains of design.
This paper presents the following contributions:
(1)
CreativeConnect, a system that supports graphic designers’ ideation process by helping them extract elements from reference images and suggesting a wide range of recombinations of those elements.
(2)
Computational pipelines with generative models that extract and suggest keywords from images and generate recombinations of keywords in text descriptions and sketches.
(3)
Findings from a user study (N=16) about how CreativeConnect can aid designers in each step of recombination, leading participants to generate more design ideas and perceive their ideas as more creative.
2 Related Work
This work aims to support designers in their reference recombination process for creativity. In this section, we review previous literature on (1) how references are used in graphic design ideation, (2) how recombination is employed for creative thinking, and (3) previous generative AI approaches for creativity.
2.1 Reference in Graphic Design Ideation
The creative process begins by collecting relevant inspirational materials from various sources [22, 77]. Designers leverage these collected examples to gain a comprehensive understanding of the problem space. As the process advances into idea generation, these compiled examples play a pivotal role in fostering creativity, igniting new ideas through analogical thinking [29, 34]. As ideation is recognized as one of the most challenging phases in the entire design process, previous research on creativity support tools has concentrated extensively on enhancing this step [25]. Previous research demonstrated that designers get valuable insights and inspiration in different ways [31], and many studies have delved into the significance of references in design thinking, showing their potential to stimulate creativity and innovation [3, 78].
One of the primary approaches to support idea generation with references is to help designers see diverse references. Exploring diverse ideas is crucial in terms of preventing fixation [37], in which a designer becomes overly fixated on a single concept, potentially hindering creativity and innovation. Therefore, Zhang et al. [89] have utilized a Generative Adversarial Network (GAN) for exploring diverse images, while Matejka et al. [60] developed the Dream Lens to assist in exploring generative 3D design solution space.
Another avenue of research is to help designers manage the inspirations drawn from references, particularly through the use of mood boards [22]. Prior research demonstrated that building a mood board can enhance the comprehension and interpretation of ephemeral elements in design [27], is beneficial for both defining and resolving design challenges [8], and ultimately leads to a boost in creativity [58]. Therefore, many computational systems have been proposed to help designers build interactive mood boards, such as Funky Wall [59], SemanticCollage [49], and May AI [48].
While this paper primarily focuses on the recombination of references, we have integrated two significant insights from prior research about design references. First, we emphasize the importance of offering users diverse images to support their creative processes. Second, we have incorporated the concept of a mood board as a valuable tool for organizing references within our system.
2.2 Recombination for Creative Thinking
In the creative thinking process, new ideas often come through the combination of existing examples [7, 79]. It has been shown that creativity often arises from forging new associations among previously unrelated frames [50]. This process includes two crucial components: recognizing the differences between existing concepts and blending them [2, 64]. Also, the diversity of the given examples is important for building novel associations between them during this process [63]. Observations of designers’ creative processes showed that designers often maintain multiple small components and keep employing them to generate new variations through a process akin to recombination [24]. Many computational systems have also been proposed for building recombinations and were verified to be effective in tasks such as chair design [87] or text-based ideation [9].
One practical implementation of this concept in design ideation is genetic exploration, which generates novel solutions by merging elements from preexisting designs to widen the range of references. This approach has been applied in diverse domains such as garden design [45], 3D modeling [14, 70], architecture [82], and 2D graphics [88]. However, these approaches primarily aim to enrich the pool of references in the information-gathering stage by utilizing existing references rather than supporting designers in generating their own ideas from those recombinations in the next stage.
In recombination, it is also critical to decompose the reference and obtain elements that are worth combining. Several tools have been developed to facilitate this process, especially by automatically decomposing the original source and showing its fine-grained aspects. CollageMachine [44] decomposes websites and turns them into an interactive collage. MetaMap [39] provides a decomposed view of the reference image along three dimensions (semantic, color, and shape) and lets users explore more references with it. Hope et al. [35] divide a product’s information into fine-grained functional parts, allowing users to combine the inspiring parts. MoodCubes [36] offers a new mood board experience by decomposing multimedia references into constituent elements and using them to provide suggestions for new inspirational materials. These systems, however, do not directly discuss strategies for merging the decomposed outputs into a new design idea. On the other hand, VRicolage [80] enables users to decompose objects into different parts, motions, or colors, and mix them. However, this process is more about utilizing collected assets than generating a new idea from recombination.
Additionally, many previous approaches have supported the process of mixing reference images or concepts. For example, VisiBlends [12] and VisiFit [13] introduced a novel pipeline to blend two objects to convey an integrated meaning. ICONATE [91] supports users in generating a new icon by mixing different icons, and PopBlends [84] automatically suggests conceptual blends of reference images. FashionQ [38] supports this blending in the domain of fashion design. Artinter [15] supports recombining style elements from references to facilitate communication. Nevertheless, these approaches primarily focus on seamlessly merging entire references rather than breaking them down to the element level. This may not fully align with the creative recombination process, which often begins by identifying specific elements to combine within the provided examples. 3DALL-E [57] presents a recombination workflow for generating new ideas, which suggests diverse low-level keywords and combines them into a prompt for text-to-image models. This approach, however, differs from our definition of reference recombination, as the keywords come from the LLM’s understanding of the world rather than from the design references.
2.3 Generative AI Approaches for Creativity
Before examining AI systems for creativity, it is important to understand how visual designers perceive AI support for their design tasks. Ko et al. [47] investigated how graphic designers use large-scale text-to-image generation models (LTGMs) in their creative work and suggested design guidelines for building creativity support systems with them.
Recently, diffusion-based techniques [65, 72, 74] and CLIP embeddings [71] have enabled people to turn their ideas into visual materials quickly and easily using text prompting. Many approaches have also incorporated inputs from additional modalities, such as layout [10, 55, 92] or sound [53, 81]. Techniques to add extra conditions and styles for more granular control have been proposed as well [61, 90]. There is also a thread of research on modifying generated images to better align with user intent, such as adding style [28], latent-space manipulation [42, 43], human-prompt editing [6], and editing a specific part of generated images [26, 75].
With these novel ML techniques, the creative landscape is continuously being reshaped, offering innovative solutions and enriching the artistic experience. Promptify [5] stands out as an iterative prompt refinement tool, letting users get closer to their intended result by clearing away unintended outcomes. PromptPaint [16] allows users to go beyond language by mixing prompts to express challenging concepts, supporting iterative shaping of the image. The interplay between humans and AI is also evolving rapidly. Karimi et al. [40, 41] propose generative AI systems that help designers by collaborating during the design phase instead of taking over the design process. Oh et al. [66] and Framer [51] proposed user-AI collaborative interfaces for a co-drawing experience.
While there has been much research on expressing user intention to ML models accurately to get a better image or on collaborating with AI during the design execution phase, such work is less relevant to the ideation task of expanding the variety of ideas. Specifically, how to design interactions with generative AI models that inspire graphic designers by recombining references remains to be discovered.
3 Formative Study
We conducted a formative study to understand how designers recombine design references for ideation and what challenges they encounter during the process.
3.1 Participants
Prior research [4] suggested that less experienced designers tend to encounter more challenges in getting inspiration from references and combining them. Therefore, we targeted early-stage designers, as they are expected to understand the overall design process but still struggle with ideation through recombination compared to professional designers. We defined an early-stage designer as someone who received a design education at university or has less than three years of professional experience as a designer.
Six participants (6 female; age M=25.3 and SD=3.32) were recruited through an online recruitment posting. Two were professional UI/UX designers with 1 year of experience each, and one was a freelance brand designer with 3 years of experience. Three were students majoring in industrial design, with two at the graduate level and one in their fourth year of undergraduate studies. All participants reported that they had experience in at least three different graphic design projects before.
3.2 Study Process
The study included (1) an observation on reference searching and idea sketching and (2) a semi-structured interview. For the first part of the study, participants were asked to draw an illustration for one of three different design topics they chose: "Tourism service for kids," "Pet grooming service," or "Eco-friendly restaurant." They were first given 10 minutes to search for reference images that they wanted to use. For each reference they chose, they were asked to describe what aspects of the references they found appealing. Then, participants sketched their design ideas using their preferred method for 30 minutes. Three participants used pen and paper to sketch their ideas, while the other three used a tablet and digital drawing software. They were asked to generate at least three distinct design ideas and describe how they integrated their references into each sketch. After that, we conducted a semi-structured interview to ask about their challenges in generating multiple ideas using references.
After each study session, two authors independently coded the recombination methods the participants employed in their ideation tasks and the semi-structured interview results. The coded data were then discussed collaboratively. After conducting six studies, codes were saturated, and no further study sessions were conducted.
3.3 Findings
Through the observation of participants’ design processes, we discovered that the recombination of different references primarily occurs during the initial stages of design ideation, with a specific focus on the conceptual aspects of the reference images rather than the visual elements. We identified four distinct categories of elements employed in this process. We also found specific challenges associated with it and observed that the system supporting this process should reserve a degree of incompleteness to encourage creativity.
3.3.1 Early-Stage Design Ideation Focuses on Conceptual Aspects.
All six participants said they refer to the references in two distinct stages: conceptual ideation and visual development. During the conceptual ideation stage, designers focus on elements that could effectively convey the design topic, such as objects or mood. After looking at those elements, they generated multiple drafts by combining them in several ways. On the other hand, the visual development stage revolved around adding visual details like color and texture to complete the sketches derived from the conceptual ideation. During this stage, designers often had a clear direction in their mind and referred to a specific set of references that aligned well with their chosen direction, with less emphasis on exploring different recombinations of diverse references. This aligns with findings from previous research [33], which shows that artists engage in a spectrum of reference usage in their creative process, ranging from detailed recreation (visual development) by tracing images to interpretive inspiration for high-level components (conceptual ideation). In summary, designers recombined references primarily for conceptual ideation, which was usually the first step of the design ideation, suggesting that a system supporting the reference recombination process should focus on how to facilitate this early-stage step.
3.3.2 Types of Elements Used for Recombination.
During the conceptual ideation phase, participants tried to extract specific elements from references and incorporate them into their design concepts. They employed a variety of approaches for this. The simplest approach observed in all participants was utilizing objects in a reference in their sketch. For example, for drawing an illustration for "Tourism service for kids," one participant took an image of a paper plane from a reference to convey the image of playful children and tour service at the same time. Five participants extracted the abstract semantic meaning or overall theme conveyed by references. For example, after looking at an image of a person holding a pamphlet and deep in thought, one participant said that the keyword "imagination" could effectively capture the concept of kids. So, they developed a design concept about children imagining various travel destinations. Another approach observed in three out of six participants was to take the action of a character from a reference. For instance, by looking at a reference illustrating an animal and a person holding hands, a participant got the concept of children holding hands together. Lastly, five participants referred to the composition from the reference images. For example, by looking at a reference where leaf shapes were arranged together to form a shovel, one participant came up with the idea of using multiple tree trunk shapes to represent the structure of a building.
3.3.3 Challenges During Finding Elements.
We identified some opportunities to support the process of extracting elements from the references. There were many cases where the elements designers initially found appealing in the reference search phase differed from those they eventually utilized in their design concepts. In the interview, participants said that upon closer examination of the references, they discovered new elements of interest and incorporated them. This means that designers couldn’t immediately extract elements upon viewing the reference, and it often required several examinations to uncover such elements, which was time-consuming.
Another observation was that participants often came up with new keywords for further brainstorming at the element level, based on what they had already found. For instance, P3 identified "toy blocks" from one reference and "train" from another reference, then came up with the new keyword "toy train" and incorporated it into their final idea. However, this process was often more challenging than finding elements directly from the reference images. P4 highlighted an opportunity for the system to support this thinking process, mentioning, “I usually talk with others about my ideas, which leads me to discover new keywords related to the original one. Just like that, I think it would be nice if the system could recommend a new keyword to expand my current design idea.”
3.3.4 Challenges During Recombining Elements.
After identifying the elements from the references that they wanted to utilize in their design ideas, another challenge became apparent. While there can be numerous ways to combine these elements, participants were often frustrated because they could not sketch out all the possibilities to determine if they were viable. Three out of six participants expressed anxiety about not being able to consider all possible combinations. P3 stated, “I always feel anxious that there might be a better way, but I can’t think of it.” P6 also mentioned, “The more options I explore, the more I become confident about my final design idea. I want some faster way to explore alternatives as much as possible.” Four out of six participants said they rely on their imagination to envision numerous recombination possibilities in their minds, as sketching them all out is too time-consuming and effortful. However, two participants expressed frustration that, although combinations seemed good in their minds, they might not come together as effectively in actual sketches.
3.3.5 System Support should be Incomplete.
Designers tended to deliberately exclude visual details during conceptual ideation. Participants said that when recombining references for conceptual inspiration, they did not pay much attention to visual details, and several participants noted that they even needed to exclude those details intentionally. P2 stated, “When combining different concepts, colors and textures often become messy, so I deliberately use the same brush for all elements.” P3 agreed from another viewpoint, expressing concern about becoming overly fixated on frequently recurring visual details while exploring conceptual recombinations. We also asked the participants which form they would prefer if they could get recommendations for different recombination options. Four participants mentioned that they would prefer incomplete outputs, such as a sketch or even a textual description of the idea, so that they could focus on the concept itself. The main reason for this was the concern that the model would compromise their creativity or lead them to unintentional plagiarism.
3.4 Design Goals
Based on the findings of the formative study, we identified four design goals to build a system to support designers’ reference recombination process during early-stage ideation.
DG 1.
Facilitate Element Extraction from References. To help users efficiently find the elements that would be used for the recombination, the system should help users discover the overlooked elements. Based on our observation, elements that users want to extract from references are (1) subject matters (e.g., objects, characters, landscapes), (2) action & pose, (3) theme & mood, and (4) arrangement.
DG 2.
Suggest Diverse and Relevant Elements. To help users explore more elements on top of what they found from the references, the system should provide some recommendations of relevant elements that users might like.
DG 3.
Generate Diverse Recombination Options. To help users explore diverse recombination possibilities, the system should show users a varied range of recombination options and reduce their anxiety over not considering all feasible combinations. This goal highlights the system’s ability to propose combinations that users might not have considered independently.
DG 4.
Present Recombination in an Incomplete Format. To align with designers’ preference for conceptual sketches over highly detailed artwork during the initial ideation phase, the system-generated outputs should be intentionally incomplete, such as sketches. This emphasizes the importance of allowing users to inject their own creativity into the images.
4 CreativeConnect
Based on the derived design goals, we implemented CreativeConnect (Figure 2), an AI-powered design tool that supports graphic designers in coming up with novel design ideas by recombining reference images in early-stage conceptual ideation. CreativeConnect mainly consists of a mood board where users can import reference images and select what they like about each reference. When the user imports a new image, the system extracts keywords according to the four categories defined in the formative study (Section 3.3.2) so that users can choose among them. This helps users easily discover and select keywords (DG 1). Selected keywords are then displayed on the mood board along with the images. CreativeConnect offers further keyword recommendations based on the keywords users have added to the board or their specific selections (DG 2). Also, when the user chooses a set of keywords to recombine, the system generates multiple drafts with diverse ways of combining them (DG 3). All system-generated recombination outputs are produced as line sketches with one-line descriptions so that users can further reinterpret them by themselves (DG 4).
Figure 2:
4.1 User Scenario
To demonstrate our system, we show how Sarah, a junior illustration designer, uses CreativeConnect to generate ideas for her design project. Sarah recently accepted a new commission to draw an illustration for the cover of a children’s book titled, "A Christmas Dinner in the Underwater World." As the given topic is an unusual combination of two themes, she struggled to get inspiration from the references and mix them to come up with ideas, so she decided to explore references with the help of CreativeConnect.
4.1.1 Getting User Inputs on the Design Reference.
Sarah first uploads ten reference images she got from her client into CreativeConnect. Looking through the references, she is intrigued by the one where two scuba divers swim with a turtle. When she chooses the image, CreativeConnect shows some keywords that can be found in the image, divided into four categories – subject matter, action & pose, theme & mood, and arrangement (Figure 2 (a)). As she finds the scuba diver concept interesting, she clicks on the subject matter category. She finds "scuba diver" in the keyword list and clicks it. She also finds "coral reef" on the list, which she didn’t recognize before. She looks at the references again and thinks coral reefs would look great in her illustration, so she clicks "coral reef" as well. Similarly, she looks through the list of the keywords in the "action & pose" and "theme & mood" categories and selects "swimming" and "adventure" from each list. She also likes the overall composition of the image, so she clicks its "arrangement" as well. She also works on selecting keywords that she likes on other references.
4.1.2 Mood Board with the User-selected Keywords & Keyword Recommendation.
As Sarah selects the keywords she finds useful from each image, the canvas of the CreativeConnect offers a dynamic mood board that shows the references with user-selected keywords, capturing her creative goal and preferences (Figure 2 (b)). As she freely moves the images to organize them, the selected keywords move along with the image. By looking at the keywords, Sarah wants to come up with additional ideas for character actions that align with the adventurous theme, similar to swimming or scuba diving. Therefore, she selects "subject matter: scuba diver", "action & pose: swimming", and "theme & mood: adventure" to get system recommendations with these keywords. CreativeConnect shows a set of keywords, such as "action & pose: exploring sunken ship", and "subject matter: anchor". Sarah finds those keywords valuable, so she drags them into the mood board.
4.1.3 Recombining Design References using Keywords.
From the set of keywords on the mood board, Sarah now selects some keywords she wants to include in her design and uses the system to make a first draft. She selects "Christmas tree" and "Santa Claus" for a Christmas dinner theme, and "whale", "swimming", "exploring the sunken ship", and "adventure" for the underwater theme. She also selects the "arrangement" of one of the images with an interesting composition.
After clicking the merge button, CreativeConnect generates three different drafts, each showcasing a unique and different way of incorporating these keywords (Figure 2 (c)). Each draft contains a one-line text description of the image concept and a sketch-style image generated based on the description and the arrangement that Sarah selected. She appreciates the results as the way each draft combined keywords would be difficult to think of by herself and that all three drafts are distinct from each other. Also, the sketch format allows her to imagine further design concepts rather than fixating on the concept and details in the generated results.
Among the drafts, Sarah finds one description interesting: "Santa Claus goes on an underwater adventure on a sled pulled by a whale." However, she feels dissatisfied with the generated sketch and presses the "More Sketches" button. Then, CreativeConnect generates five more sketches with the same description but in a slightly different way. She gets some good design ideas from the new sketches and starts working on her draft.
4.2 Technical Details
CreativeConnect was built as a web-based system with a ReactJS-based front-end client and a Flask-based back-end server. We implemented ML pipelines for extracting the keywords from the references and merging keywords into recombinations. The technical details of these pipelines are discussed in the following sections. Some examples of outputs from the pipeline are presented in Figure 9 in the Appendix.
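To illustrate how the client and server fit together, below is a minimal sketch of a Flask endpoint that the ReactJS client might call to run the keyword-extraction pipeline; the route name and the `extract_keywords` wrapper are hypothetical and not taken from the paper.

```python
# Minimal sketch of the client-server split (hypothetical route and helper names).
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/api/extract-keywords", methods=["POST"])
def extract_keywords_endpoint():
    # The ReactJS client posts the path (or URL) of an uploaded reference image.
    image_path = request.json["image_path"]
    return jsonify(extract_keywords(image_path))

def extract_keywords(image_path):
    # Placeholder: the real pipeline combines BLIP-2 captioning, GPT-4 parsing,
    # and Segment Anything, as described in Section 4.2.1.
    return {"subject_matter": [], "action_pose": [], "theme_mood": [], "arrangement": []}

if __name__ == "__main__":
    app.run(port=5000)
```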
Figure 3:
4.2.1 Extracting Keywords from Reference Images (Figure 3. (a)).
Based on the findings from our formative study, our pipeline is designed to extract keywords from a provided reference image in four categories: subject matter, action & pose, theme & mood, and arrangement. To achieve this, we follow a multi-step process.
To identify the subject matter, action & pose, and theme & mood within the image, we employ an image captioning model BLIP-2 [54] to generate textual descriptions of the image contents. For a comprehensive understanding of the entire image, we divide it into 3 × 3 segments and generate captions for each segment as well as the whole image. These captioning results are then processed by GPT-4 [68], a Large Language Model (LLM), to extract lists of subject matter, action & pose, and theme & mood present in the image captions. Prompts used for this are in Appendix A.1.
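The following is a minimal sketch of this caption-then-parse step, assuming the Hugging Face BLIP-2 checkpoint and the OpenAI chat API; the parsing prompt here is only illustrative, and the actual prompts are in Appendix A.1.

```python
# Sketch of the caption-then-parse step: caption the whole image and its 3x3
# segments with BLIP-2, then ask GPT-4 to turn the captions into keyword lists.
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration
from openai import OpenAI

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
blip2 = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")
client = OpenAI()

def caption(image):
    inputs = processor(images=image, return_tensors="pt")
    out = blip2.generate(**inputs, max_new_tokens=30)
    return processor.batch_decode(out, skip_special_tokens=True)[0].strip()

def extract_text_keywords(image: Image.Image) -> str:
    w, h = image.size
    captions = [caption(image)]  # caption of the whole image
    for i in range(3):           # captions of the 3x3 grid segments
        for j in range(3):
            crop = image.crop((j * w // 3, i * h // 3, (j + 1) * w // 3, (i + 1) * h // 3))
            captions.append(caption(crop))
    # Illustrative prompt; the actual prompts are in Appendix A.1.
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": "From these captions, list the subject matter, "
                              "action & pose, and theme & mood keywords:\n" + "\n".join(captions)}])
    return response.choices[0].message.content
```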
For the arrangement, we utilize the Segment Anything model [46] to generate segments and then identify the top ten prominent segments within the image using the approach from LLM-grounded Diffusion [56]. Bounding boxes around these segments provide information about the image’s overall structure, such as where the items are placed and where large negative spaces are.
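Below is a rough sketch of this arrangement extraction, assuming the official segment-anything package; selecting segments by mask area is a simplification of the prominence criterion borrowed from LLM-grounded Diffusion, and the checkpoint path is a placeholder.

```python
# Sketch of arrangement extraction: run Segment Anything, keep the ten largest
# masks, and return their bounding boxes (x, y, w, h) as the layout description.
import numpy as np
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")  # placeholder path
mask_generator = SamAutomaticMaskGenerator(sam)

def extract_arrangement(image_rgb: np.ndarray, top_k: int = 10):
    masks = mask_generator.generate(image_rgb)  # list of dicts with "area" and "bbox"
    masks = sorted(masks, key=lambda m: m["area"], reverse=True)[:top_k]
    # The boxes approximate where items are placed and where negative space remains.
    return [m["bbox"] for m in masks]
```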
Additionally, for generating the recommendations of the relevant keywords, we use GPT-4, and the prompts used for this are shown in Appendix A.2.
4.2.2 Generating Recombinations from Keywords (Figure 3. (b)).
When the user selects a set of keywords to generate a new recombination, our system generates a range of options to mix those keywords.
The system first generates three textual descriptions encompassing the selected subject matter, action & pose, and theme & mood keywords. Then, it extracts the list of objects that must be drawn in the image for this description. We use few-shot prompting with GPT-3.5-turbo [67] for this. The prompt used is in Appendix A.3. For the arrangement, we developed a layout variator to create layouts similar to the selected image’s arrangement while aligned with the generated text description. The layout variator first applies an empirically defined random variation of -50 to 50 pixels on each bounding box component (i.e., x, y, w, h) in the original arrangements. Then, it randomly selects boxes depending on the number of objects that need to be drawn and sorts the most similar layouts first using a similarity metric. The similarity is calculated by summing the IoU and the complement of the min-max normalized centroid distance between the closest pairs of bounding boxes. Based on this similarity, the top five arrangements are used for recombination generation. The most similar layout is used for generating the image in the initial iteration, and the other layouts are used when the user requests more sketches. Few-shot prompting with GPT-3.5-turbo is used to map the arrangements to the objects to create the best image possible. The full prompt is shown in Appendix A.4.
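The similarity metric of the layout variator could look roughly like the sketch below, assuming boxes are given as (x, y, w, h) tuples; the pairing and normalization details are simplified relative to the actual implementation.

```python
# Sketch of the layout variator's jitter and similarity scoring; boxes are
# (x, y, w, h) tuples, and the pairing/normalization details are simplified.
import random

def jitter(boxes, delta=50):
    # Random variation of -50..50 px on each bounding box component.
    return [tuple(v + random.randint(-delta, delta) for v in box) for box in boxes]

def iou(a, b):
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def centroid_dist(a, b):
    return ((a[0] + a[2] / 2 - b[0] - b[2] / 2) ** 2 +
            (a[1] + a[3] / 2 - b[1] - b[3] / 2) ** 2) ** 0.5

def layout_similarity(original, candidate):
    # For each original box, pair it with its closest candidate box, then sum
    # IoU + (1 - min-max normalized centroid distance) over those pairs.
    dists = [min(centroid_dist(o, c) for c in candidate) for o in original]
    lo, hi = min(dists), max(dists)
    score = 0.0
    for o, d in zip(original, dists):
        c = min(candidate, key=lambda b: centroid_dist(o, b))
        norm = (d - lo) / (hi - lo) if hi > lo else 0.0
        score += iou(o, c) + (1.0 - norm)
    return score
```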
However, when the user does not select any arrangement from the references, the system generates a broader range of diverse layout options. A few-shot prompting pipeline using GPT-3.5-turbo generates the three most appropriate layouts for the given text description and object list. This pipeline is built based on the previous work [56], and the full prompt for this is in Appendix A.5.
Given the textual description and the list of the objects mapped with the generated layout, the system generates images with a layout diffusion model [55]. Following our design goal, the system converts the generated image into a simple line sketch using the U-Net structured style transfer model [20].
4.3 Technical Evaluation
We evaluated the ML-based pipelines for keyword extraction, keyword recommendation, and textual description generation by merging keywords.
4.3.1 Keyword Extraction Pipeline.
We built a dataset of 100 images with tags categorized by the subject matter, action & pose, and theme & mood. We asked 20 people with expertise in design or HCI to annotate five images each. On average, 5.03, 1.87, and 2.29 keywords in the category of subject matter, action & pose, and theme & mood, respectively, were collected per image.
Using this dataset as ground truth, we evaluated the predictions from the keyword extraction pipeline. Keywords in the subject matter and action & pose categories were manually matched one by one with similar ground-truth tags. The precision and recall of our pipeline were 94.2% and 58.2% for subject matter, and 35.3% and 51.3% for action & pose. Although some salient keywords in the dataset were missed, the pipeline provided quite accurate subject matter keywords. The predicted action & pose keywords were not perfectly aligned with the dataset tags, but they were still acceptable on the user side because users perceived them as similar even when they were not completely accurate (e.g., for an image of a cat standing straight, our pipeline predicted "stretching arms", while the ground truth is "dancing"). For theme & mood keywords, we calculated the cosine similarity of the mean embedding vectors [85] of the ground truth and the predicted result to compare their semantic similarity. This was because, for theme & mood, even if words are not exactly the same, many other words can be accepted as similar. The similarity between the ground truth and the prediction was 0.826, which means the keyword extraction pipeline estimates theme & mood words quite closely. Examples of the predictions are presented in the Appendix (Figure 10).
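As an illustration of the theme & mood comparison, a sketch of the mean-embedding cosine similarity is shown below; the sentence-transformers model name is an assumption, not necessarily the embedding model used in the paper [85], and the keyword lists are illustrative.

```python
# Sketch of the theme & mood check: cosine similarity between the mean embedding
# of the ground-truth keywords and that of the predicted keywords.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding backbone

def mean_embedding(keywords):
    return encoder.encode(keywords).mean(axis=0)

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

ground_truth = ["cozy", "festive", "warm"]  # illustrative tags
predicted = ["holiday spirit", "warmth"]
print(cosine(mean_embedding(ground_truth), mean_embedding(predicted)))
```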
4.3.2 Keyword Recommendation Pipeline.
We evaluated the keyword recommendation pipeline based on whether there was a proper level of similarity between the original keywords and the recommended keywords. This was because recommendations would only be effective if they were neither too similar nor too irrelevant to the original keywords.
We randomly sampled three to ten keywords from each image-keyword pair in the dataset and made 100 sets of keywords. Then, from the pipeline, we got the recommendations for each set. To verify whether these recommendations have a proper range of diversity, we generated two comparison groups of keywords: the irrelevant group and the synonym group. The irrelevant group consists of random keywords from the dataset, and the synonym group is generated by paraphrasing the keyword in each set. NLTK [1] and GPT-3.5 were employed to find synonyms. Then, we used the text embedding [85] to calculate the cosine similarity of each group with the original keywords.
The similarity of the irrelevant and synonym groups to the original keywords was 0.624 and 0.774, respectively, and the recommended keywords had a similarity of 0.696, which lies in between. This shows that our recommendations are less similar to the original keywords than the synonym group but more similar than the irrelevant group.
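A sketch of this group comparison is shown below, again assuming a sentence-transformers embedding model; the keyword lists are illustrative placeholders.

```python
# Sketch of the group comparison: recommended keywords should sit between the
# irrelevant group and the synonym group in similarity to the original set.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding backbone

def group_similarity(original, group):
    a = encoder.encode(original).mean(axis=0)
    b = encoder.encode(group).mean(axis=0)
    return util.cos_sim(a, b).item()

original = ["scuba diver", "swimming", "adventure"]
print(group_similarity(original, ["coral reef", "exploring sunken ship"]))  # recommended
print(group_similarity(original, ["birthday cake", "skyscraper"]))          # irrelevant
print(group_similarity(original, ["diver", "swim", "adventurous"]))         # synonyms
```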
4.3.3 Recombination Generation Pipeline.
The recombination generation pipeline gets a user selection of a set of keywords and generates three different descriptions of the possible image that includes those keywords. As the pipeline aims to provide diverse options, we evaluated the diversity of the description generation model.
Similar to Section 4.3.2, we built 100 sets of keywords randomly extracted from the dataset. We generated three descriptions using our pipeline for each set, calculated the cosine similarities between those three, and averaged them. Here, we calculated diversity as 1 − similarity. To validate our description generator, we prepared two more description sets, one consisting of explicitly unrelated descriptions randomly acquired from the dataset, and the other consisting of descriptions that merely paraphrase one of the generated descriptions using a T5-based paraphraser [83]. The diversity within the random and paraphrased groups was 0.801 and 0.209, respectively, while the descriptions generated by our pipeline showed a diversity of 0.395. This indicates that the generated output is more diverse than mere paraphrases and less diverse than random descriptions, which means that the pipeline generates descriptions with a reasonable amount of diversity.
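A sketch of this diversity score is shown below under the same embedding assumption: diversity is one minus the mean pairwise cosine similarity among the three generated descriptions; the example descriptions are illustrative.

```python
# Sketch of the diversity score: 1 minus the mean pairwise cosine similarity
# among the three descriptions generated for one keyword set.
from itertools import combinations
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding backbone

def diversity(descriptions):
    embeddings = encoder.encode(descriptions)
    sims = [util.cos_sim(embeddings[i], embeddings[j]).item()
            for i, j in combinations(range(len(descriptions)), 2)]
    return 1.0 - sum(sims) / len(sims)

print(diversity([
    "Santa Claus rides a sled pulled by a whale past a sunken ship.",
    "A Christmas tree made of coral glows at an underwater dinner table.",
    "Divers exchange gifts with fish under a string of holiday lights.",
]))
```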
We didn’t evaluate the later part of this pipeline, which is about generating images and transforming them into sketches, as we used models from previous research [56] without any customization or adaptation.
5 Evaluation
We conducted a within-subjects comparative study with 16 participants. As our design goals encompassed two steps of the reference recombination – (1) Finding elements (DG 1 and DG 2) and (2) Recombining elements (DG 3 and DG 4), we first observed how CreativeConnect supported each of these steps. We also evaluated whether CreativeConnect eventually improves designers’ idea generation results and how it supports the creative process.
•
RQ1. How does CreativeConnect support the two steps of the recombination process—finding elements from the references and recombining elements?
•
RQ2. Can CreativeConnect help users generate better quality and quantity of design ideas?
•
RQ3. How do users utilize the output of CreativeConnect in their ideation process?
The baseline system shared a similar interface with CreativeConnect but lacked its key features—extracting keywords from references, suggesting relevant keywords, and generating recombination options. In the baseline system, users could manually leave keyword notes on each reference image, create sketches by specifying layouts and prompts to the image generation model, and use ChatGPT. To assess the efficacy of the design of CreativeConnect’s features and pipelines rather than the effect of AI functionalities, the same AI functionalities were also included in the baseline system. After observing prevalent use cases of AI in design processes through a recent survey [69] and videos [21, 76, 93], we included both a language model and an image generation model in the baseline system to simulate real-world scenarios of designers with AI tools. The baseline included models closely aligned with the CreativeConnect pipeline so that model performance would not affect the study results: instead of the GPT models, we provided GPT-3.5-based ChatGPT, and for image generation, we offered the same layout diffusion model as CreativeConnect. A screenshot of the baseline interface is presented in Figure 11 in the Appendix.
5.1 Participants
We recruited 16 participants (10 females, 6 males; age M=24.81, SD=3.78) through an online recruitment posting. To determine whether CreativeConnect can address the challenges found in the formative study with early-stage designers, we recruited a participant group similar to that of the formative study. We required participants to have a degree in design or art and to have participated in at least three different design projects. Eleven participants were students majoring in design—5 at the graduate level and 6 at or above the third-year undergraduate level. The other 5 participants had graduated—2 had majored in design, 1 had minored in design, and the others had majored in media arts and painting.
All participants reported having sufficient sketching skills, as we asked them to draw their design ideas during the task. The study lasted 2 hours, and we compensated participants with 70,000 KRW (approximately 53 USD).
5.2 Study Procedure
Figure 4:
The whole process of the user study is shown in Figure 4. Participants were asked to perform design ideation tasks twice in two settings: CreativeConnect and baseline. The task was to draw an illustration for the cover of a fictional children’s book, "Starry Safari: Exploring Alien Jungles" or "A Christmas Dinner in the Underwater World". They were also provided with 10 reference images for each topic. The order of topics and tools was counterbalanced for each participant.
For the first five minutes of each round, participants had a tutorial on the given system and tried it out with sample images to get used to it. They were then given the topic and the reference images and started ideation using the tool for 30 minutes. If the participants came up with a design idea they wanted to develop further, they sketched it on the paper using a pen. After each round, they completed the post-task survey. Between the two rounds, they could get a 10-minute break. After both rounds, we conducted a 20-minute semi-structured interview to ask about the difference between the two conditions and the effect of the tools on their ideation process. The interview questions are in Appendix B.2.
5.3 Measures
The survey after each round included questions about the usefulness of the given system for the different steps of ideation: organizing the references, discovering useful elements from the references, generating multiple ideas from the collected elements, exploring multiple ideas, and discovering new ideas. The survey also included five questions about satisfaction with participants’ sketch results regarding overall outcome, quantity, quality, diversity, and creativity. We also included five questions from [86] to assess participants’ self-perceived experience of using the AI system. Participants answered these questions for the image generation feature and ChatGPT after the baseline session, and for the keyword extraction, keyword recommendation, and image/description generation features after the CreativeConnect session. Also, the survey included the Creativity Support Index [11] and the NASA-TLX questionnaire [30].
We also gathered the usage logs (i.e., participant actions with timestamps) to get quantitative metrics for user behaviors. We used this data to calculate the time taken for each sketch, the number of images generated, the number of inputs provided to the image-generating model, etc. Also, every time the participants completed the sketch, the system prompted participants to rate how well the given tool assisted them in producing the idea.
Additionally, we conducted an expert evaluation of the participants’ sketches. We recruited two experts who held bachelor’s degrees in art and had 6 and 1.5 years of experience teaching art, respectively. We asked them to evaluate two factors on a 7-point Likert scale: (1) the creativity of each sketch and (2) the diversity of ideas within a set of sketches. We randomly chose three sketches drawn by each participant on each design topic, and a total of 96 sketches (3 sketches x 16 participants x 2 conditions) were evaluated. The evaluators rated the sketches individually, and for cases of significant score differences (more than 3 points), we asked the evaluators to re-evaluate them. While re-evaluating, they were given each other’s comments and scores and could choose to change their original score or keep it. They also had to leave comments about their decision. Nine sketches required re-evaluation, and all conflicts were resolved after one round of re-evaluation. We then used the average of the two evaluators’ scores for the result analysis.
6 Results
Results showed that CreativeConnect helped participants both find and recombine elements for reference recombination. Also, it was shown that users with CreativeConnect could generate more design ideas in a given time and perceived their ideas as more creative compared to the baseline. We also found some differences between CreativeConnect and baseline regarding how users utilize the tool for their creative process.
Figure 5:
6.1 Support for Different Recombination Steps
To answer RQ1, we examined survey questions and log analysis results divided into two steps of reference recombination: (1) discovering keywords from the reference images and (2) recombining the found elements into a new concept. We used a Wilcoxon signed-rank test for all survey questions, as they were ordinal data on a 7-point Likert scale. For the usage log analysis, we conducted a two-sample t-test or two-sample paired t-test to compare between CreativeConnect and baseline.
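For reference, the named tests map onto standard SciPy calls as in the minimal sketch below; the ratings are placeholder values, not the study data.

```python
# Sketch of the statistical tests named above, using SciPy; the arrays are
# placeholder 7-point Likert ratings, not the study data.
from scipy.stats import wilcoxon, ttest_rel, ttest_ind

creativeconnect = [6, 7, 5, 6, 7, 6]
baseline        = [4, 5, 3, 5, 5, 4]

print(wilcoxon(creativeconnect, baseline))   # ordinal survey ratings (paired)
print(ttest_rel(creativeconnect, baseline))  # paired usage-log metrics per participant
print(ttest_ind(creativeconnect, baseline))  # unpaired metrics (e.g., per-input-set values)
```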
6.1.1 Finding Keywords from the Reference.
Participants perceived that CreativeConnect helped them discover valuable keywords from the given reference images. As shown in Figure 5, participants found CreativeConnect significantly more helpful in discovering valuable elements from the references that could be used for their ideation (M=6.13, SD=1.31) compared to the baseline system (M=3.75, SD=1.98 / p=0.001, W=0.0). Regarding how CreativeConnect and the baseline helped with organizing references, the ratings were not significantly different, though with a slightly higher average for CreativeConnect (Baseline: M=4.38, SD=1.96 / CreativeConnect: M=5.31, SD=1.58 / p=0.121, W=23.5).
Usage logs also showed that CreativeConnect effectively encouraged participants to explore and extract different keywords. Comparing the numbers of keyword notes that participants left in both conditions using a two-sample paired t-test, participants with CreativeConnect added more keyword notes (M=34.69, SD=10.74) than with the baseline system (M=13.19, SD=10.53 / p<0.0001, t=5.52). Also, as shown in Figure 6, participants with the baseline typically extracted keywords exclusively during their initial sketch, thereafter relying solely on the previously extracted keywords without actively discovering additional ones. In contrast, participants using CreativeConnect consistently added more keywords throughout the whole process. While they also extracted the most keywords at the beginning, they continued to extract new keywords from references for every new sketch. One participant (P15) drew all sketches in one go after developing multiple design ideas, instead of sketching immediately after formulating each idea. As we cannot match keyword notes with each specific sketch in this case, this data was omitted from the analysis of actions associated with each sketching instance. All participants’ raw usage log data, including P15’s, is provided in Appendix B.3.
Figure 6:
6.1.2 Recombining elements.
The survey findings indicated that CreativeConnect can be useful for recombining different elements into new design ideas. Participants said that CreativeConnect was significantly more helpful (M=5.94, SD=1.34) than the baseline system (M=4.88, SD=1.89 / p=0.023, W=10.5) for generating multiple ideas from the collected elements (Figure 5). However, participants’ perception of how much the system helped them explore multiple ideas was not significantly different between the two conditions, although CreativeConnect had a slightly higher average rating (Baseline: M=4.88, SD=1.93 / CreativeConnect: M=5.69, SD=1.35 / p=0.178, W=22.0). The difference was also not significant in terms of discovering novel ideas, but the average was slightly higher for CreativeConnect (Baseline: M=5.00, SD=1.75 / CreativeConnect: M=5.75, SD=1.48 / p=0.110, W=19.0).
We also examined how participants used the given image generation model to recombine elements into a design idea. As shown in Table 1, there was no significant difference in the number of generated images between the two conditions. However, CreativeConnect showed particular advantages in exploring diverse recombinations with the model, as evident from the distinct patterns observed when users interacted with the generation model under the two conditions. With the baseline system, users could provide separate inputs for the overall image description and each object; CreativeConnect allowed users to select multiple keywords to merge. In both conditions, participants could input multiple phrases together to combine them. We analyzed how semantically diverse the phrases given together as a single input to the model were. Out of a total of 347 input sets (202 from the baseline, 145 from CreativeConnect), 14 sets (11 from the baseline, 3 from CreativeConnect) consisted of only one input and were excluded from the analysis, since our objective was to compare the semantic similarity between phrases provided to the model together. For the remaining 333 input sets (191 from the baseline, 142 from CreativeConnect), we computed the semantic similarity between all pairs of phrases within each input set and calculated the mean and minimum similarity. The mean similarity represents the overall similarity between phrases provided as input together, while the minimum similarity represents the most diverse pair within the set. Finally, we conducted a two-sample t-test for each metric.
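A sketch of this per-input-set analysis is shown below, assuming a sentence-transformers embedding model; the phrases are illustrative.

```python
# Sketch of the per-input-set analysis: mean and minimum pairwise cosine
# similarity among the phrases given together to the generation model.
from itertools import combinations
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding backbone

def set_similarity(phrases):
    embeddings = encoder.encode(phrases)
    sims = [util.cos_sim(embeddings[i], embeddings[j]).item()
            for i, j in combinations(range(len(phrases)), 2)]
    return sum(sims) / len(sims), min(sims)  # (mean, minimum)

mean_sim, min_sim = set_similarity(["Christmas tree", "whale", "exploring the sunken ship"])
print(mean_sim, min_sim)
```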
As shown in Table 1, the input sets created within CreativeConnect showed significantly lower similarity between keywords (M=0.222, SD=0.094) compared to the sets made within the baseline system (M=0.263, SD=0.166 / p=0.008, t=2.66) when calculated based on the minimum similarity. The difference shows a similar trend when calculated based on the mean similarity, but falls just short of significance (Baseline: M=0.356, SD=0.148 / CreativeConnect: M=0.330, SD=0.075 / p=0.051, t=1.95). This means that participants with CreativeConnect actively sought to create unique recombinations with greater semantic diversity, ultimately exploring more diverse and distinct recombinations than in the baseline condition.
                                              CreativeConnect        Baseline             Statistics
                                              mean        std        mean        std      p        Sig.
Image Generation Model Usage (Per session)
  # of generated image                        57.06       17.91      46.69       23.52    0.119    -
  # of user inputs to the model               9.31        4.57       10.56       4.76     0.468    -
Semantic Similarity within Input Sets
  Semantic Similarity (Mean)                  0.330       0.075      0.356       0.148    0.051    -
  Semantic Similarity (Min)                   0.222       0.094      0.263       0.166    0.008    ⁎⁎
Table 1: Number of image generation model usage and the semantic similarity between user inputs in CreativeConnect and baseline. (-: p > .05, ⁎: p < .050, ⁎⁎: p < .010, ⁎⁎⁎: p < .001)
6.2 Ideation Results
To answer RQ2, we analyzed the design idea sketches that participants drew during the study session through expert evaluation, usage log, and survey results. Similar to the RQ1, the Wilcoxon signed-rank test was used for survey questions. We used a two-sample t-test for expert evaluation and log analysis results. For pairwise data, such as comparing the number of sketches drawn in each condition by each participant, we conducted a two-sample paired t-test.
6.2.1 Creativity & Diversity of the Final Sketches.
Figure 7:
As shown in Figure 7 (b), the survey results showed that participants perceived their sketches as more creative when using CreativeConnect (M=5.38, SD=1.09) compared to the baseline (M=4.19, SD=1.64 / p=0.004, W=0.0). During the interview, 12 out of 16 participants said that they felt they could be more creative with the support of CreativeConnect than with the baseline, especially when they had a hard time coming up with a new idea in the early ideation stage. There were no statistically significant differences between the two conditions in terms of other factors, including overall satisfaction, quantity, quality, and diversity of the sketches.
However, as shown in Figure 7 (a), the expert evaluation did not show a significant difference between the two conditions. The creativity score from the expert evaluation was slightly higher for CreativeConnect (M=4.854, SD=1.418) than for the baseline (M=4.344, SD=1.708 / p=0.114, t=1.59), but the difference was not significant according to the two-sample t-test. There was also no significant difference in diversity (Baseline: M=4.625, SD=1.607 / CreativeConnect: M=4.75, SD=1.418 / p=0.833, t=0.23). There are possible reasons why the expert evaluation differed from the survey results. First, even though the experts were asked to focus on the idea as much as possible, the participants’ sketching skills were inevitably reflected in the evaluation, and some of the comments left by the evaluators were indeed about sketching skills. It is also possible that variation across design topics affected the results; in fact, sketches on the underwater Christmas topic were rated higher on average.
6.2.2 Efficiency of the Ideation Process.
Table 2:
|                           | CreativeConnect |      | Baseline |      | Statistics |      |
|---------------------------|-----------------|------|----------|------|------------|------|
|                           | mean            | std  | mean     | std  | p          | Sig. |
| # of sketches per session | 5.56            | 1.63 | 5.06     | 1.73 | 0.041      | ⁎    |
| time per sketch (min)     | 5.01            | 2.87 | 5.39     | 3.03 | 0.403      | -    |
Table 2: Number of sketches drawn by the participants per session and the average time taken for sketches. (-: p > .05, ⁎: p < .050, ⁎⁎: p < .010, ⁎⁎⁎: p < .001)
As shown in Table 2, the paired t-test showed that participants came up with more sketches in the same 30-minute ideation session with the support of CreativeConnect (M=5.56, SD=1.63) than with the baseline (M=5.06, SD=1.73 / p=0.041, t=2.24). This result indicates that CreativeConnect can support efficient ideation. The interviews also suggested that CreativeConnect could be useful when designers have to come up with many ideas from a limited set of references in limited time, a common scenario in professional design tasks where clients provide references and designers must produce drafts from them.
6.2.3 Perceived Workload.
As shown in Table 4, there was no difference between the two conditions in perceived workload. Although CreativeConnect adds steps, such as requiring users to select keywords as input to the image generation model, this did not make users feel overwhelmed while performing the task.
6.3 Impact on User’s Creative Process
6.3.1 Source of the Inspiration.
Table 3:
| Source of Inspiration       |                              | CreativeConnect | Baseline |
|-----------------------------|------------------------------|-----------------|----------|
| Within the tool             | Generated image/description  | 9               | 5        |
|                             | Recommended keywords         | 1               | -        |
|                             | ChatGPT answers              | -               | 2        |
| Outside of the tool         | Own creativity               | 1               | 1        |
|                             | Reference images             | 5               | 8        |
| Avg. tool assistance rating |                              | 5.625           | 4.563    |
Table 3: Number of inspiration sources by category for the most creative sketches chosen by the participants and the average rating of the efficiency of the tool assistance for drawing those sketches. Figure 8 illustrates the example use cases of inspiration within the tool.
To investigate how users used the output of CreativeConnect and the baseline system differently when generating new design ideas, we asked participants to pick the one sketch they considered most creative in each study session and to explain how they got the inspiration for it.
Five different sources of inspiration were found across the two conditions. In both conditions, many participants got their ideas from the generated images or text descriptions. In the CreativeConnect condition, more than half of the participants said that their best ideas were inspired by these generated images or texts (Table 3). As illustrated in Figure 8 (a), participants utilized keywords from both the reference images and the recommendations and merged them using the system. Notably, they tended to reinterpret the generated images in their own way rather than accepting what was drawn in them. Participants with the baseline system were also influenced by the generated images, but slightly less often (Table 3), and in a different way: they tended to refer to the visual composition or details of the shapes and apply them to their sketches. P7 explained the reason for this: “While putting prompts into the image generation model (in the baseline), I already had the concept I wanted. Therefore, I refer to the expression method of it, rather than trying to find something new out of it.”
Another noticeable finding is that participants were influenced more by the given reference images when using the baseline. This suggests that CreativeConnect can make users less directly affected by reference images, ultimately preventing them from fixating on them. P16 pointed this out explicitly: “When using CreativeConnect, I gave less focus to given images, and as I can expand to a lot of ideas only with a small number of references, I didn’t even use all of them.” P14 mentioned, “This (baseline) tool feels like a notepad that manages references, so I kept referring to the reference images themselves.”
As shown in Figure 8 (d), one participant also got an idea from CreativeConnect’s recommended keywords. In the baseline, participants could use ChatGPT instead of this keyword recommendation feature, and two participants said that they got their inspiration from it. However, this usage was relatively rare (Table 3), mainly because of the challenges of using it for visual tasks; in the interviews, participants mentioned difficulties in formulating prompts and in leveraging the language-based output for their designs.
This difference in sources of inspiration was reflected in participants’ ratings of how effective the tool’s assistance was. We conducted a paired t-test to compare participants’ ratings of the tool’s usefulness for generating their favorite ideas. The rating was higher for CreativeConnect (M=5.63, SD=1.41) than for the baseline (M=4.56, SD=1.89 / p=0.045, t=2.18) (Table 3), indicating that users perceived the features of CreativeConnect as more helpful for coming up with their best ideas than those of the baseline system.
Figure 8:
The survey results about the perceived experience of using the AI-based system also point to a more specific reason for this helpfulness. As shown in Table 4, CreativeConnect was rated significantly better for helping users think through what kind of outputs they wanted to produce for the given task (baseline: M=5.00, SD=1.97 / CreativeConnect: M=6.13, SD=1.02 / p=0.045, W=14.5). This indicates that participants did not treat the results of CreativeConnect’s image generation model as their final results but rather as a guide for thinking about what they wanted, which led them to think in diverse ways. P9 mentioned, “In baseline, the result came out exactly what I thought, so I replicated the output. However, CreativeConnect shows me various high-level ways to combine things so I could explore those methods and expand those processes on my own.”
Table 4:
|                                        | CreativeConnect |      | Baseline |      | Statistics |      |
|----------------------------------------|-----------------|------|----------|------|------------|------|
|                                        | mean            | std  | mean     | std  | p          | Sig. |
| Self-perceived experience on ML model  |                 |      |          |      |            |      |
| Match goal                             | 5.00            | 1.63 | 4.63     | 1.96 | 0.5805     | -    |
| Think through                          | 6.13            | 1.02 | 5.00     | 1.97 | 0.0454     | ⁎    |
| Transparent                            | 4.81            | 1.80 | 4.38     | 1.67 | 0.4488     | -    |
| Controllable                           | 4.75            | 1.95 | 4.06     | 1.84 | 0.2976     | -    |
| Collaborative                          | 5.38            | 1.59 | 4.94     | 2.08 | 0.4809     | -    |
| NASA-TLX                               |                 |      |          |      |            |      |
| Mental                                 | 3.69            | 1.82 | 4.19     | 1.94 | 0.39       | -    |
| Physical                               | 1.81            | 1.22 | 2.50     | 2.10 | 0.10       | -    |
| Temporal                               | 2.81            | 1.83 | 3.50     | 2.28 | 0.23       | -    |
| Effort                                 | 3.63            | 1.82 | 3.94     | 2.05 | 0.63       | -    |
| Performance                            | 5.31            | 1.08 | 5.06     | 1.39 | 0.78       | -    |
| Frustration                            | 2.63            | 1.93 | 3.50     | 1.75 | 0.14       | -    |
| Creativity Support Index               |                 |      |          |      |            |      |
| Enjoyment                              | 5.91            | 1.00 | 5.09     | 1.78 | 0.077      | -    |
| Exploration                            | 5.38            | 1.54 | 4.81     | 1.56 | 0.211      | -    |
| Expressiveness                         | 5.44            | 1.18 | 4.53     | 1.75 | 0.032      | ⁎    |
| Immersion                              | 4.69            | 1.99 | 4.69     | 1.82 | 1          | -    |
| Results Worth Effort                   | 5.47            | 1.27 | 5.25     | 1.71 | 0.591      | -    |
| Collaboration                          | 5.19            | 1.25 | 4.41     | 1.71 | 0.016      | ⁎    |
Table 4: Survey results of self-perceived experience on ML features, NASA-TLX questionnaire, and Creativity Support Index. (-: p > .05, ⁎: p < .050, ⁎⁎: p < .010, ⁎⁎⁎: p < .001)
6.3.2 Creativity Support Index.
According to Table 4, participants rated CreativeConnect significantly higher than the baseline on expressiveness and collaboration, while the other criteria showed no significant difference between the two systems. Through the post-study interviews, we found that participants experienced different types of creativity support in each system. Participants said that the baseline was helpful when they already had an overall idea in mind and wanted support for expressing it in a sketch. On the other hand, participants said that CreativeConnect was helpful for their creativity when they did not yet have an idea. These differences are explained in more detail in Section 7.1.
7 Discussion
We proposed CreativeConnect, a novel AI-infused creativity support tool that assists graphic designers in generating design ideas by recombining reference images. Based on our findings, we suggest several design implications for future creativity support tools.
7.1 CreativeConnect vs. baseline - Two Different Types of Creativity Support
The results show that CreativeConnect successfully supports early-stage conceptual ideation through reference recombination, aligning well with the four design goals we derived from the formative study. Participants could easily extract keywords (DG 1) and utilize keyword recommendations as a source of new inspiration (DG 2), leading them to make more keyword notes. They also explored diverse keyword recombinations (DG 3), leading them to produce more design ideas in a given time. Additionally, they perceived their ideas as more creative because CreativeConnect provided its output as an incomplete sketch and let participants inject their own creativity into it (DG 4). However, participants did not perceive a difference in the overall degree of creativity support between the two tools. The interviews revealed that this was because CreativeConnect and the baseline both provided valid creativity support, but in distinct ways depending on users’ current needs.
In the baseline system, users had to specify all the details of the generated image, so they appreciated its transparency and control. The system faithfully reproduced user input, resulting in final output that closely mirrored the concept in their minds. These generated outputs helped users actualize their existing ideas, supporting implementation more than ideation [17, 18]. Sketch-Sketch Revolution [23] and Framer [51] took a similar approach to creativity support.
Conversely, CreativeConnect stimulates creativity by providing inspiration [17, 18]. Instead of requiring users to provide detailed input, CreativeConnect accepts keywords and deliberately refrains from exact expression, generating a wider range of outcomes, potentially with serendipity. How CreativeConnect provides participants with this creative leap can be explained by Cross’s descriptive model of creative design [19]. The keyword extraction feature actively supports emergence, allowing designers to find unrecognized properties of an existing design. The keyword recommendation supports mutation, helping designers generate new ideas by partially modifying existing designs. P15 metaphorically likened this process to having someone nearby constantly talking with them and offering fresh variations of ideas. Furthermore, the keyword merging feature enhances combination, where new ideas are generated by combining features from existing designs. Therefore, CreativeConnect could be helpful for addressing a common challenge known as “artist’s block” or “creative block”, similar to the “writer’s block” experienced by writers [32]. CreativeConnect could provide proper support when designers find themselves creatively stuck, breaking creative inertia by sparking novel ideas and opening new creative avenues.
These differences offer valuable design implications for future creativity support tools, as designers require different types of creativity support at different stages of the ideation process. By dynamically adjusting the type of support based on the user’s context, such a tool can offer a more personalized and practical creative experience. For instance, when the system detects that a user is in the exploration phase, it can employ an approach similar to CreativeConnect, encouraging the generation of diverse and abstract ideas. Conversely, when the user wants to refine and develop a particular concept, the tool can provide baseline-like features to ensure greater control and fidelity in the generated output. This adaptable approach acknowledges the multifaceted nature of the creative process and supports users with the right tools at the right moment, ultimately enhancing their creativity. Moreover, integrating both inspiration and implementation support into a single tool can enable a seamless transition between generating diverse ideas and refining specific concepts, fostering a more iterative and efficient creative workflow.
7.2 The Role of Low-fidelity Output for Creativity Support
The post-study interviews showed that adopting low-fidelity output can facilitate imagination beyond what the system provides. We deliberately employed low-fidelity sketch output in both CreativeConnect and the baseline. In the interviews, 12 out of 16 participants preferred the sketch output over a complete image because it left room for imagination and interpretation. An image converted into a sketch omits small details and retains only the larger forms, creating large empty spaces. This emptiness encourages users not merely to perceive the generated image but to see it as room for further development, and it keeps them deeply engaged in further ideation. Some participants even expressed opposition to completed images at the ideation stage, believing that an abundance of detail in reference images makes them fixate on that specific design and hinders them from using the images for their own ideas. P2 said, “I usually get completed artworks from Pinterest as a reference, and I found myself unavoidably looking at the unique style of that designer, wanting to replicate it. This time, I liked that I could maintain my own style while exploring different references of concepts.” Based on our findings, adopting low-fidelity output could be an option when designing creativity support systems to prevent fixation and facilitate the user’s creativity during ideation. For example, a design reference tool could dynamically adjust the level of detail of the provided images based on the user’s current design stage: when the user wants references for overall concepts, the system can convert reference images to simple black line drawings or even present them solely as textual descriptions; conversely, when the user has settled on a specific concept and is exploring visual details, the system can offer the original images in full detail.
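To make this implication concrete, the sketch below illustrates one possible way to approximate an adjustable level of detail by converting a reference image into a line drawing with OpenCV edge detection; the function, parameters, and thresholds are hypothetical and not part of CreativeConnect’s pipeline.

```python
# Illustrative sketch (not the paper's pipeline): approximate an adjustable
# level of detail by converting a reference image into a line drawing whose
# amount of retained detail depends on the user's current design stage.
import cv2

def to_low_fidelity(image_path, detail=0.3):
    """Convert an image to a black-on-white line drawing.

    detail in [0, 1]: lower values keep only larger forms by blurring more
    and raising the Canny thresholds before edge detection.
    """
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    blur_size = max(1, int((1.0 - detail) * 10)) * 2 + 1   # odd kernel size
    blurred = cv2.GaussianBlur(img, (blur_size, blur_size), 0)
    low, high = int(100 * (1.0 - detail)) + 50, int(200 * (1.0 - detail)) + 100
    edges = cv2.Canny(blurred, low, high)
    return cv2.bitwise_not(edges)  # black lines on a white background

# e.g., concept stage -> sparse line drawing; detail stage -> show the original image instead
sketch = to_low_fidelity("reference.jpg", detail=0.2)
cv2.imwrite("reference_sketch.png", sketch)
```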
7.3 Generalizability of CreativeConnect in Different Contexts
CreativeConnect is designed to support early-stage designers, such as design students, who have a general understanding of the design process but need help with reference recombination. However, our user study revealed some insights applicable to other expertise levels. We observed that participants with limited sketching skills were more satisfied with the baseline system, as its output was more aligned with their intentions and better suited to aiding the actual sketching. Therefore, for users less familiar with artistic expression, an AI tool’s output should prioritize alignment with the user’s original intent rather than abstraction. Conversely, for experts accustomed to extracting inspiration from references and combining it into their own original ideas [4], CreativeConnect could serve as a tool for serendipity rather than as help with the process of keyword extraction and recombination. For example, P16 said that the suggested keywords and merged images acted as prompts reminding them of aspects they had initially overlooked. For such users, features could be redesigned to encourage reflection and creative exploration, for example by highlighting the parts of the generated images that were not present in the existing references but emerged through our system’s features.
The user study results also showed that CreativeConnect could be utilized in other design contexts, such as collaborative projects. According to the CSI survey results (Section 6.3.2), participants indicated that CreativeConnect would be significantly helpful for collaborating with other designers. This was because CreativeConnect is designed to follow the sequential steps of leaving keyword notes and merging them, and it keeps track of these processes on the mood board and the merging panel. Participants said that simply showing the CreativeConnect screen could communicate their creative process to other designers, making it easier to understand each other’s thinking and quickly reach agreement on a design direction. One future direction is to incorporate the features of CreativeConnect into collaborative mood board tools [15, 48, 49] and study the benefits of keyword-based recombination features.
CreativeConnect could also be used in other design domains. Our design goals and the feature design of CreativeConnect are primarily tailored to illustration design, which is predominantly about conveying design topics through visual subject matter and does not usually involve other modalities such as text (common in poster or publication design) or motion and interaction (common in UI/UX and motion graphic design). However, even in other design domains, the recombination process of extracting elements from references and recombining them is an effective strategy. To apply the recombination approach to another design domain, we must first identify which elements designers in that domain focus on when looking at references and use those categories of elements as the keywords in CreativeConnect’s pipeline.
8 Limitations and Future Work
Our work has several limitations that future work can address. In our user study, the ideation tasks were conducted for 30 minutes in each condition, which is shorter than an actual design process, so it was difficult to observe how behavior changes over a longer period. Future work can incorporate CreativeConnect into real-world design projects and examine how behavior patterns differ from those in the lab study.
Our pipeline recombines keywords by generating an image description that contains all of the keywords selected by the user. However, there are other possible ways of recombination, such as blending objects or expressing some keywords indirectly through visual details such as color. Further work can explore these recombination methods and how to support them.
As both CreativeConnect and the baseline leverage generative AI, including an LLM and a layout diffusion model, the results may be influenced by users’ familiarity with AI. Since this study did not explore this dimension, future research can examine how creativity support tools with AI features may have varying effects depending on the user’s knowledge of AI or prior experience with AI tools.
9 Conclusion
This paper proposed CreativeConnect, a system designed to support graphic designers in the reference recombination process, allowing them to generate novel design ideas. Building on our formative study observations, CreativeConnect assists users in identifying key elements within reference images and provides diverse recommendations for relevant keywords and recombination options. Notably, the low-fidelity sketch-based output of CreativeConnect was shown to encourage creativity by enabling further imaginative exploration. Our user study demonstrated that CreativeConnect efficiently supported both steps of finding and recombining elements and helped participants come up with more design ideas and perceive their ideas as more creative than with the baseline. While CreativeConnect represents a promising step towards comprehensive recombination support tools for designers, we also identified opportunities to expand such systems to address a broader spectrum of design needs and situations.
Acknowledgments
This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2021-0-01347, Video Interaction Technologies Using Object-Oriented Video Modeling / No. 2019-0-00075, Artificial Intelligence Graduate School Program (KAIST)). We thank all of our study participants and the members of KIXLAB for their insightful discussions and constructive feedback.
A Technical Details
A.1 Prompt: Extracting Keywords from Image Captions
A.2 Prompt: Recommending Relevant Keywords
A.3 Prompt: Generating Recombinations in Text Descriptions
A.4 Prompt: Matching Layout with Objects
A.5 Prompt: Generating Layout based on Image Caption
A.6 Example Outputs from the Technical Pipeline
Figure 9 shows some examples of inputs and outputs for three technical pipelines of CreativeConnect — (1) element extraction pipeline, (2) keyword recommendation pipeline, and (3) recombination generation pipeline. Figure 10 shows more examples from the keyword extraction pipeline.
Figure 9:
Figure 10:
B User Study
B.1 Baseline System Interface
Figure 11 shows the interface of the baseline system used in the user study. The baseline system looks similar to CreativeConnect. The left panel has no keyword extraction feature, but it allows participants to add keyword notes manually. The center contains the same interactive mood board as CreativeConnect, but without the keyword suggestion panel. The right panel lets users manually configure the layout and prompts for image generation instead of selecting keywords to combine. Other features, such as mood board interactions (zoom, add/delete images) and saving favorite sketches, were provided as in CreativeConnect. In addition to this interface, participants could use ChatGPT for various purposes.
Figure 11:
B.2 Interview questions
These are the questions used in the semi-structured interview after the two idea generation sessions with the baseline and CreativeConnect.
(1)
Can you share the idea sketch you think is most creative in each topic, and what was the main source of inspiration for those ideas?
(2)
Comparing the baseline and CreativeConnect, what were the main differences you noticed in the idea generation process?
(3)
In each of the three main stages of idea generation—finding reference elements, exploring ideas, and generating sketches—did you find one tool more helpful than the other, and why?
(4)
Were there any differences in your typical approach to idea generation when using these tools? If so, how was it different from the usual work process?
(5)
Which functionalities were most beneficial in both tools, and in what scenarios were they particularly useful?
(6)
Were there any situations or specific sketches where the tools were especially useful or not useful?
(7)
In terms of image generation methods, what were the main differences between baseline and CreativeConnect, and when did you feel each method was more helpful?
(8)
How did you feel about the output in sketch format, and do you think the tool’s effectiveness would differ if outputs were presented as a completed image rather than a sketch?
(9)
How did you incorporate the generated images into your final idea sketch?
B.3 Additional User Study Results: Raw Usage Log
Figure 12 shows the full usage log for all 16 user study participants, showing the timestamps of 3 types of user actions (adding keyword notes, generating images, and completing a design idea sketch).
Steven Bird, Ewan Klein, and Edward Loper. 2009. Natural language processing with Python: analyzing text with the natural language toolkit. O’Reilly Media, Inc.
Margaret A. Boden. 1998. Creativity and artificial intelligence. Artificial Intelligence 103, 1 (1998), 347–356. https://doi.org/10.1016/S0004-3702(98)00055-1
Nathalie Bonnardel. 1999. Creativity in design activities: The role of analogies in a constrained cognitive environment. In Proceedings of the 3rd conference on Creativity & cognition. 158–165.
Nathalie Bonnardel and Evelyne Marmèche. 2005. Towards supporting evocation processes in creative design: A cognitive approach. International Journal of Human-Computer Studies 63, 4 (2005), 422–435. https://doi.org/10.1016/j.ijhcs.2005.04.006 Computer support for creativity.
Stephen Brade, Bryan Wang, Mauricio Sousa, Sageev Oore, and Tovi Grossman. 2023. Promptify: Text-to-Image Generation through Interactive Prompt Exploration with Large Language Models. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (San Francisco, CA, USA) (UIST ’23). Association for Computing Machinery, New York, NY, USA, Article 96, 14 pages. https://doi.org/10.1145/3586183.3606725
Tim Brooks, Aleksander Holynski, and Alexei A. Efros. 2023. InstructPix2Pix: Learning to Follow Image Editing Instructions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 18392–18402.
Donald T Campbell. 1960. Blind variation and selective retentions in creative thought as in other knowledge processes. Psychological review 67, 6 (1960), 380.
Tracy Cassidy. 2011. The Mood Board Process Modeled and Understood as a Qualitative Design Research Tool. Fashion Practice 3, 2 (2011), 225–251. https://doi.org/10.2752/175693811X13080607764854
Joel Chan, Steven Dang, and Steven P. Dow. 2016. Comparing Different Sensemaking Approaches for Large-Scale Ideation. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (San Jose, California, USA) (CHI ’16). Association for Computing Machinery, New York, NY, USA, 2717–2728. https://doi.org/10.1145/2858036.2858178
Erin Cherry and Celine Latulipe. 2014. Quantifying the Creativity Support of Digital Tools through the Creativity Support Index. ACM Trans. Comput.-Hum. Interact. 21, 4, Article 21 (jun 2014), 25 pages. https://doi.org/10.1145/2617588
Lydia B. Chilton, Savvas Petridis, and Maneesh Agrawala. 2019. VisiBlends: A Flexible Workflow for Visual Blends. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland Uk) (CHI ’19). Association for Computing Machinery, New York, NY, USA, 1–14. https://doi.org/10.1145/3290605.3300402
Orestes Chouchoulas and A.K. Day. 2007. Design Exploration Using A Shape Grammar With A Genetic Algorithm. Open House International 32 (06 2007), 26–35. https://doi.org/10.1108/OHI-02-2007-B0004
John Joon Young Chung and Eytan Adar. 2023. Artinter: AI-Powered Boundary Objects for Commissioning Visual Arts. In Proceedings of the 2023 ACM Designing Interactive Systems Conference (Pittsburgh, PA, USA) (DIS ’23). Association for Computing Machinery, New York, NY, USA, 1997–2018. https://doi.org/10.1145/3563657.3595961
John Joon Young Chung and Eytan Adar. 2023. PromptPaint: Steering Text-to-Image Generation Through Paint Medium-like Interactions. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (San Francisco, CA, USA) (UIST ’23). Association for Computing Machinery, New York, NY, USA, Article 6, 17 pages. https://doi.org/10.1145/3586183.3606777
John Joon Young Chung, Shiqing He, and Eytan Adar. 2021. The Intersection of Users, Roles, Interactions, and Technologies in Creativity Support Tools. In Proceedings of the 2021 ACM Designing Interactive Systems Conference (Virtual Event, USA) (DIS ’21). Association for Computing Machinery, New York, NY, USA, 1817–1833. https://doi.org/10.1145/3461778.3462050
John Joon Young Chung, Shiqing He, and Eytan Adar. 2022. Artist Support Networks: Implications for Future Creativity Support Tools. In Proceedings of the 2022 ACM Designing Interactive Systems Conference (Virtual Event, Australia) (DIS ’22). Association for Computing Machinery, New York, NY, USA, 232–246. https://doi.org/10.1145/3532106.3533505
Jennifer Fernquist, Tovi Grossman, and George Fitzmaurice. 2011. Sketch-sketch revolution: an engaging tutorial system for guided sketching and application learning. In Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology (Santa Barbara, California, USA) (UIST ’11). Association for Computing Machinery, New York, NY, USA, 373–382. https://doi.org/10.1145/2047196.2047245
Jonas Frich, Michael Mose Biskjaer, Lindsay MacDonald Vermeulen, Christian Remy, and Peter Dalsgaard. 2019. Strategies in Creative Professionals’ Use of Digital Tools Across Domains. In Proceedings of the 2019 Conference on Creativity and Cognition (San Diego, CA, USA) (C&C ’19). Association for Computing Machinery, New York, NY, USA, 210–221. https://doi.org/10.1145/3325480.3325494
Jonas Frich, Lindsay MacDonald Vermeulen, Christian Remy, Michael Mose Biskjaer, and Peter Dalsgaard. 2019. Mapping the Landscape of Creativity Support Tools in HCI. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland Uk) (CHI ’19). Association for Computing Machinery, New York, NY, USA, 1–18. https://doi.org/10.1145/3290605.3300619
Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, and Daniel Cohen-Or. 2022. An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion. arxiv:2208.01618 [cs.CV]
Steve Garner and Deana McDonagh-Philp. 2001. Problem interpretation and resolution via visual stimuli: the use of ‘mood boards’ in design education. Journal of Art & Design Education 20, 1 (2001), 57–64.
Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge. 2016. Image Style Transfer Using Convolutional Neural Networks. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2414–2423. https://doi.org/10.1109/CVPR.2016.265
Sandra G. Hart and Lowell E. Staveland. 1988. Development of NASA-TLX (Task Load Index): Results of Empirical and Theoretical Research. In Human Mental Workload, Peter A. Hancock and Najmedin Meshkati (Eds.). Advances in Psychology, Vol. 52. North-Holland, 139–183. https://doi.org/10.1016/S0166-4115(08)62386-9
Scarlett R. Herring, Chia-Chen Chang, Jesse Krantzler, and Brian P. Bailey. 2009. Getting Inspired! Understanding How and Why Examples Are Used in Creative Design Practice. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Boston, MA, USA) (CHI ’09). Association for Computing Machinery, New York, NY, USA, 87–96. https://doi.org/10.1145/1518701.1518717
Josh Holinaty, Alec Jacobson, and Fanny Chevalier. 2021. Supporting Reference Imagery for Digital Drawing. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops. 2434–2442.
Tom Hope, Ronen Tamari, Daniel Hershcovich, Hyeonsu B Kang, Joel Chan, Aniket Kittur, and Dafna Shahaf. 2022. Scaling Creative Inspiration with Fine-Grained Functional Aspects of Ideas. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 12, 15 pages. https://doi.org/10.1145/3491102.3517434
Alexander Ivanov, David Ledo, Tovi Grossman, George Fitzmaurice, and Fraser Anderson. 2022. MoodCubes: Immersive Spaces for Collecting, Discovering and Envisioning Inspiration Materials. In Proceedings of the 2022 ACM Designing Interactive Systems Conference (Virtual Event, Australia) (DIS ’22). Association for Computing Machinery, New York, NY, USA, 189–203. https://doi.org/10.1145/3532106.3533565
Youngseung Jeon, Seungwan Jin, Patrick C. Shih, and Kyungsik Han. 2021. FashionQ: An AI-Driven Creativity Support Tool for Facilitating Ideation in Fashion Design. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 576, 18 pages. https://doi.org/10.1145/3411764.3445093
Youwen Kang, Zhida Sun, Sitong Wang, Zeyu Huang, Ziming Wu, and Xiaojuan Ma. 2021. MetaMap: Supporting Visual Metaphor Ideation through Multi-Dimensional Example-Based Exploration. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 427, 15 pages. https://doi.org/10.1145/3411764.3445325
Pegah Karimi, Nicholas Davis, Mary Lou Maher, Kazjon Grace, and Lina Lee. 2019. Relating Cognitive Models of Design Creativity to the Similarity of Sketches Generated by an AI Partner. In Proceedings of the 2019 Conference on Creativity and Cognition (San Diego, CA, USA) (C&C ’19). Association for Computing Machinery, New York, NY, USA, 259–270. https://doi.org/10.1145/3325480.3325488
Pegah Karimi, Jeba Rezwana, Safat Siddiqui, Mary Lou Maher, and Nasrin Dehbozorgi. 2020. Creative Sketching Partner: An Analysis of Human-AI Co-Creativity. In Proceedings of the 25th International Conference on Intelligent User Interfaces (Cagliari, Italy) (IUI ’20). Association for Computing Machinery, New York, NY, USA, 221–230. https://doi.org/10.1145/3377325.3377522
Tero Karras, Samuli Laine, and Timo Aila. 2019. A Style-Based Generator Architecture for Generative Adversarial Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, and Timo Aila. 2020. Analyzing and Improving the Image Quality of StyleGAN. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
Kevin Gonyop Kim, Richard Lee Davis, Alessia Eletta Coppi, Alberto Cattaneo, and Pierre Dillenbourg. 2022. Mixplorer: Scaffolding Design Space Exploration through Genetic Recombination of Multiple Peoples’ Designs to Support Novices’ Creativity. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 308, 13 pages. https://doi.org/10.1145/3491102.3501854
Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, and Ross Girshick. 2023. Segment Anything. arxiv:2304.02643 [cs.CV]
Hyung-Kwon Ko, Gwanmo Park, Hyeon Jeon, Jaemin Jo, Juho Kim, and Jinwook Seo. 2023. Large-Scale Text-to-Image Generation Models for Visual Artists’ Creative Works. In Proceedings of the 28th International Conference on Intelligent User Interfaces (Sydney, NSW, Australia) (IUI ’23). Association for Computing Machinery, New York, NY, USA, 919–933. https://doi.org/10.1145/3581641.3584078
Janin Koch, Andrés Lucero, Lena Hegemann, and Antti Oulasvirta. 2019. May AI? Design Ideation with Cooperative Contextual Bandits. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland Uk) (CHI ’19). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3290605.3300863
Janin Koch, Nicolas Taffin, Andrés Lucero, and Wendy E. Mackay. 2020. SemanticCollage: Enriching Digital Mood Board Design with Semantic Labels. In Proceedings of the 2020 ACM Designing Interactive Systems Conference (Eindhoven, Netherlands) (DIS ’20). Association for Computing Machinery, New York, NY, USA, 407–418. https://doi.org/10.1145/3357236.3395494
Tomas Lawton, Kazjon Grace, and Francisco J Ibarrola. 2023. When is a Tool a Tool? User Perceptions of System Agency in Human–AI Co-Creative Drawing. In Proceedings of the 2023 ACM Designing Interactive Systems Conference (Pittsburgh, PA, USA) (DIS ’23). Association for Computing Machinery, New York, NY, USA, 1978–1996. https://doi.org/10.1145/3563657.3595977
Brian Lee, Savil Srivastava, Ranjitha Kumar, Ronen Brafman, and Scott R. Klemmer. 2010. Designing with Interactive Example Galleries. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Atlanta, Georgia, USA) (CHI ’10). Association for Computing Machinery, New York, NY, USA, 2257–2266. https://doi.org/10.1145/1753326.1753667
Junnan Li, Dongxu Li, Silvio Savarese, and Steven Hoi. 2023. BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models. arxiv:2301.12597 [cs.CV]
Long Lian, Boyi Li, Adam Yala, and Trevor Darrell. 2023. LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models. arxiv:2305.13655 [cs.CV]
Vivian Liu, Jo Vermeulen, George Fitzmaurice, and Justin Matejka. 2023. 3DALL-E: Integrating Text-to-Image AI in 3D Design Workflows. In Proceedings of the 2023 ACM Designing Interactive Systems Conference (Pittsburgh, PA, USA) (DIS ’23). Association for Computing Machinery, New York, NY, USA, 1955–1977. https://doi.org/10.1145/3563657.3596098
Andrés Lucero. 2015. Funky-Design-Spaces: Interactive Environments for Creativity Inspired by Observing Designers Making Mood Boards. In Human-Computer Interaction – INTERACT 2015, Julio Abascal, Simone Barbosa, Mirko Fetter, Tom Gross, Philippe Palanque, and Marco Winckler (Eds.). Springer International Publishing, Cham, 474–492.
Andrés Lucero, Dzmitry Aliakseyeu, Kees Overbeeke, and Jean-Bernard Martens. 2009. An Interactive Support Tool to Convey the Intended Message in Asynchronous Presentations. In Proceedings of the International Conference on Advances in Computer Entertainment Technology (Athens, Greece) (ACE ’09). Association for Computing Machinery, New York, NY, USA, 11–18. https://doi.org/10.1145/1690388.1690391
Justin Matejka, Michael Glueck, Erin Bradner, Ali Hashemi, Tovi Grossman, and George Fitzmaurice. 2018. Dream Lens: Exploration and Visualization of Large-Scale Generative Design Datasets. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3173574.3173943
Felix Müller-Wienbergen, Oliver Müller, Stefan Seidel, and Jörg Becker. 2011. Leaving the beaten tracks in creative work–A design theory for systems that support convergent and divergent thinking. Journal of the Association for Information Systems 12, 11 (2011), 2.
Michael D Mumford, Michele I Mobley, Roni Reiter-Palmon, Charles E Uhlman, and Lesli M Doares. 1991. Process analytic models of creative capacities. Creativity research journal 4, 2 (1991), 91–122. https://doi.org/10.1080/10400419209534428
Alexander Quinn Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob Mcgrew, Ilya Sutskever, and Mark Chen. 2022. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models. In Proceedings of the 39th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 162), Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvari, Gang Niu, and Sivan Sabato (Eds.). PMLR, 16784–16804. https://proceedings.mlr.press/v162/nichol22a.html
Changhoon Oh, Jungwoo Song, Jinhan Choi, Seonghyeon Kim, Sungwoo Lee, and Bongwon Suh. 2018. I Lead, You Help but Only with Enough Details: Understanding User Experience of Co-Creation with Artificial Intelligence. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3173574.3174223
Marcin L. Pilat and Christian Jacob. 2008. Creature Academy: A system for virtual creature evolution. In 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence). 3289–3297. https://doi.org/10.1109/CEC.2008.4631243
Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. 2021. Learning Transferable Visual Models From Natural Language Supervision. In Proceedings of the 38th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 139), Marina Meila and Tong Zhang (Eds.). PMLR, 8748–8763. https://proceedings.mlr.press/v139/radford21a.html
Daniel Ritchie, Ankita Arvind Kejriwal, and Scott R. Klemmer. 2011. D.Tour: Style-Based Exploration of Design Example Galleries. In Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology (Santa Barbara, California, USA) (UIST ’11). Association for Computing Machinery, New York, NY, USA, 165–174. https://doi.org/10.1145/2047196.2047216
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. 2022. High-Resolution Image Synthesis With Latent Diffusion Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 10684–10695.
Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, and Kfir Aberman. 2023. DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 22500–22510.
Pao Siangliulue, Joel Chan, Krzysztof Z. Gajos, and Steven P. Dow. 2015. Providing Timely Examples Improves the Quantity and Quality of Generated Ideas. In Proceedings of the 2015 ACM SIGCHI Conference on Creativity and Cognition (Glasgow, United Kingdom) (C&C ’15). Association for Computing Machinery, New York, NY, USA, 83–92. https://doi.org/10.1145/2757226.2757230
Dean Keith Simonton. 2003. Scientific creativity as constrained stochastic behavior: the integration of product, person, and process perspectives. Psychological bulletin 129, 4 (2003), 475.
Evgeny Stemasov, David Ledo, George Fitzmaurice, and Fraser Anderson. 2023. Immersive Sampling: Exploring Sampling for Future Creative Practices in Media-Rich, Immersive Spaces. In Proceedings of the 2023 ACM Designing Interactive Systems Conference (Pittsburgh, PA, USA) (DIS ’23). Association for Computing Machinery, New York, NY, USA, 212–229. https://doi.org/10.1145/3563657.3596131
Kim Sung-Bin, Arda Senocak, Hyunwoo Ha, Andrew Owens, and Tae-Hyun Oh. 2023. Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 6430–6440.
Michela Turrin, Peter von Buelow, and Rudi Stouffs. 2011. Design explorations of performance driven geometry in architectural design using parametric modeling and genetic algorithms. Advanced Engineering Informatics 25, 4 (2011), 656–675. https://doi.org/10.1016/j.aei.2011.07.009 Special Section: Advances and Challenges in Computing in Civil and Building Engineering.
Sitong Wang, Savvas Petridis, Taeahn Kwon, Xiaojuan Ma, and Lydia B Chilton. 2023. PopBlends: Strategies for Conceptual Blending with Large Language Models. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (Hamburg, Germany) (CHI ’23). Association for Computing Machinery, New York, NY, USA, Article 435, 19 pages. https://doi.org/10.1145/3544548.3580948
Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, and Ming Zhou. 2020. MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.). Vol. 33. Curran Associates, Inc., 5776–5788. https://proceedings.neurips.cc/paper_files/paper/2020/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
Tongshuang Wu, Michael Terry, and Carrie Jun Cai. 2022. AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 385, 22 pages. https://doi.org/10.1145/3491102.3517582
Lixiu Yu and Jeffrey V. Nickerson. 2011. Cooks or Cobblers? Crowd Creativity through Combination. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Vancouver, BC, Canada) (CHI ’11). Association for Computing Machinery, New York, NY, USA, 1393–1402. https://doi.org/10.1145/1978942.1979147
Loutfouz Zaman, Wolfgang Stuerzlinger, Christian Neugebauer, Rob Woodbury, Maher Elkhaldi, Naghmi Shireen, and Michael Terry. 2015. GEM-NI: A System for Creating and Managing Alternatives In Generative Design. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (Seoul, Republic of Korea) (CHI ’15). Association for Computing Machinery, New York, NY, USA, 1201–1210. https://doi.org/10.1145/2702123.2702398
Enhao Zhang and Nikola Banovic. 2021. Method for Exploring Generative Adversarial Networks (GANs) via Automatically Generated Image Galleries. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 76, 15 pages. https://doi.org/10.1145/3411764.3445714
Nanxuan Zhao, Nam Wook Kim, Laura Mariah Herman, Hanspeter Pfister, Rynson W.H. Lau, Jose Echevarria, and Zoya Bylinskii. 2020. ICONATE: Automatic Compound Icon Generation and Ideation. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3313831.3376618
Guangcong Zheng, Xianpan Zhou, Xuewei Li, Zhongang Qi, Ying Shan, and Xi Li. 2023. LayoutDiffusion: Controllable Diffusion Model for Layout-to-Image Generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 22490–22499.