- research-article, December 2024
Customizing Text-to-Image Models with a Single Image Pair
SA '24: SIGGRAPH Asia 2024 Conference Papers, Article No.: 6, Pages 1–13, https://doi.org/10.1145/3680528.3687642
Art reinterpretation is the practice of creating a variation of a reference work, making a paired artwork that exhibits a distinct artistic style. We ask if such an image pair can be used to customize a generative model to capture the demonstrated ...
- research-article, December 2024
Customizing Text-to-Image Diffusion with Object Viewpoint Control
SA '24: SIGGRAPH Asia 2024 Conference Papers, Article No.: 7, Pages 1–13, https://doi.org/10.1145/3680528.3687564
Model customization introduces new concepts to existing text-to-image models, enabling the generation of these new concepts/objects in novel contexts. However, such methods lack accurate camera view control with respect to the new object, and users must ...
- research-article, November 2024
Revolutionizing Visuals: The Role of Generative AI in Modern Image Generation
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 20, Issue 11, Article No.: 356, Pages 1–22, https://doi.org/10.1145/3689641
Traditional multimedia experiences are undergoing a transformation as generative AI integration fosters enhanced creative workflows, streamlines content creation processes, and unlocks the potential for entirely new forms of multimedia storytelling. It ...
- Article, October 2024
Image Distillation for Safe Data Sharing in Histopathology
Medical Image Computing and Computer Assisted Intervention – MICCAI 2024, Pages 459–469, https://doi.org/10.1007/978-3-031-72117-5_43
Abstract: Histopathology can help clinicians make accurate diagnoses, determine disease prognosis, and plan appropriate treatment strategies. As deep learning techniques prove successful in the medical domain, the primary challenges become limited data ...
- Article, November 2024
Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas
Abstract: Diffusion models have become the State-of-the-Art for text-to-image generation, and increasing research effort has been dedicated to adapting the inference process of pretrained diffusion models to achieve zero-shot capabilities. An example is the ...
- Article, October 2024
StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models
Abstract: Despite the burst of innovative methods for controlling the diffusion process, effectively controlling image styles in text-to-image generation remains a challenging task. Many adapter-based methods impose image representation conditions on the ...
- Article, October 2024
Few-Shot Defect Image Generation Based on Consistency Modeling
Abstract: Image generation can solve insufficient labeled data issues in defect detection. Most defect generation methods are only trained on a single product without considering the consistencies among multiple products, leading to poor quality and ...
- Article, October 2024
SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis
Abstract: We present SegGen, a new data generation approach that pushes the performance boundaries of state-of-the-art image segmentation models. One major bottleneck of previous data synthesis methods for segmentation is the design of “segmentation labeler ...
- Article, October 2024
DreamDiffusion: High-Quality EEG-to-Image Generation with Temporal Masked Signal Modeling and CLIP Alignment
Abstract: This paper introduces DreamDiffusion, a novel method for generating high-quality images directly from brain electroencephalogram (EEG) signals, without the need to translate thoughts into text. DreamDiffusion leverages pre-trained text-to-image ...
- Article, October 2024
OMG: Occlusion-Friendly Personalized Multi-concept Generation in Diffusion Models
Abstract: Personalization is an important topic in text-to-image generation, especially the challenging multi-concept personalization. Current multi-concept methods are struggling with identity preservation, occlusion, and the harmony between foreground and ...
- Article, October 2024
Collaborative Control for Geometry-Conditioned PBR Image Generation
- Shimon Vainer,
- Mark Boss,
- Mathias Parger,
- Konstantin Kutsy,
- Dante De Nigris,
- Ciara Rowles,
- Nicolas Perony,
- Simon Donné
Abstract: Graphics pipelines require physically-based rendering (PBR) materials, yet current 3D content generation approaches are built on RGB models. We propose to model the PBR image distribution directly, avoiding photometric inaccuracies in RGB ...
- Article, October 2024
AccDiffusion: An Accurate Method for Higher-Resolution Image Generation
Abstract: This paper attempts to address the object repetition issue in patch-wise higher-resolution image generation. We propose AccDiffusion, an accurate method for patch-wise higher-resolution image generation without training. An in-depth analysis in ...
- research-article, December 2024
Feature Fusion for Multi-Condition Controllable Image Generation
BDIOT '24: Proceedings of the 2024 8th International Conference on Big Data and Internet of Things, Pages 77–83, https://doi.org/10.1145/3697355.3697368
The rapid advancement of AI-generated content (AIGC) has led to methods that produce highly diverse images from text descriptions. However, relying solely on text descriptions often fails to meet the creator's precise needs. To address this issue, ...
- article, September 2024
Advancing Architectural Design Through Generative Adversarial Network Deep Learning Technology
International Journal of Distributed Systems and Technologies (IJDST-IGI), Volume 15, Issue 1, Pages 1–15, https://doi.org/10.4018/IJDST.353305
Recent advancements in deep learning have popularized Generative Adversarial Networks for image generation. This study investigates integrating Generative Adversarial Networks technology into architectural design to empower architects in creating diverse,...
- poster, July 2024
PictorialAttributes: Depicting Multiple Attributes with Realistic Imaging
SIGGRAPH '24: ACM SIGGRAPH 2024 Posters, Article No.: 69, Pages 1–2, https://doi.org/10.1145/3641234.3671072
Traditional visualizations often use abstract graphics, limiting understanding and memorability. Existing methods for pictorial visualization are more engaging, but often create disjointed compositions. To address this, we propose PictorialAttributes, a ...
- research-article, July 2024
Separate-and-Enhance: Compositional Finetuning for Text-to-Image Diffusion Models
SIGGRAPH '24: ACM SIGGRAPH 2024 Conference Papers, Article No.: 103, Pages 1–10, https://doi.org/10.1145/3641519.3657527
Despite recent significant strides achieved by diffusion-based Text-to-Image (T2I) models, current systems are still less capable of ensuring decent compositional generation aligned with text prompts, particularly for the multi-object generation. In ...
- research-article, June 2024
ConceptLab: Creative Concept Generation using VLM-Guided Diffusion Prior Constraints
ACM Transactions on Graphics (TOG), Volume 43, Issue 3, Article No.: 34, Pages 1–14, https://doi.org/10.1145/3659578
Recent text-to-image generative models have enabled us to transform our words into vibrant, captivating imagery. The surge of personalization techniques that has followed has also allowed us to imagine unique concepts in new scenes. However, an intriguing ...
- Work in Progress, May 2024
Engaging and Entertaining Adolescents in Health Education Using LLM-Generated Fantasy Narrative Games and Virtual Agents
- Ian Steenstra,
- Prasanth Murali,
- Rebecca B. Perkins,
- Natalie Joseph,
- Michael K Paasche-Orlow,
- Timothy Bickmore
CHI EA '24: Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, Article No.: 126, Pages 1–8, https://doi.org/10.1145/3613905.3650983
Games have been successfully used to provide engaging health interventions for adolescents. However, translating health education goals into a playable game has historically taken many person-months of effort, involving game designers, scriptwriters, ...
- Article, April 2024
MAP-Elites with Transverse Assessment for Multimodal Problems in Creative Domains
Artificial Intelligence in Music, Sound, Art and Design, Pages 401–417, https://doi.org/10.1007/978-3-031-56992-0_26
Abstract: The recent advances in language-based generative models have paved the way for the orchestration of multiple generators of different artefact types (text, image, audio, etc.) into one system. Presently, many open-source pre-trained models combine ...
- research-article, June 2024
Emerging image generation with flexible control of perceived difficulty
Computer Vision and Image Understanding (CVIU), Volume 240, Issue C, https://doi.org/10.1016/j.cviu.2023.103919
Abstract: Emerging images (EI) are two-tone and contain a number of discrete speckles. If certain speckles are appropriately organized together, we will perceive a meaningful object, which reflects the closed-loop information processing of human visual ...
Highlights:
- Emerging images (EIs) hold significant application value across multiple domains.
- We proposed a novel EI generation framework that promises more user-control over the nature of the EIs.
- Our framework utilizes multiple cognitive ...