Computer Science > Computer Vision and Pattern Recognition

arXiv:2404.05666 (cs)

[Submitted on 8 Apr 2024]

Title: YaART: Yet Another ART Rendering Technology

Abstract: In the rapidly progressing field of generative models, the development of efficient and high-fidelity text-to-image diffusion systems represents a significant frontier. This study introduces YaART, a novel production-grade text-to-image cascaded diffusion model aligned to human preferences using Reinforcement Learning from Human Feedback (RLHF). During the development of YaART, we focus especially on the choice of model size and training dataset size, aspects that had not previously been systematically investigated for text-to-image cascaded diffusion models. In particular, we comprehensively analyze how these choices affect both the efficiency of the training process and the quality of the generated images, both of which are highly important in practice. Furthermore, we demonstrate that models trained on smaller datasets of higher-quality images can successfully compete with those trained on larger datasets, establishing a more efficient scenario for training diffusion models. From the quality perspective, YaART is consistently preferred by users over many existing state-of-the-art models.

Comments:	Prompts and additional information are available on the project page, see this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2404.05666 [cs.CV]
	(or arXiv:2404.05666v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2404.05666

Submission history

From: Sergey Kastryulin
[v1] Mon, 8 Apr 2024 16:51:19 UTC (9,138 KB)
