Computer Science > Computer Vision and Pattern Recognition

arXiv:2311.04287 (cs)

[Submitted on 7 Nov 2023]

Title:Holistic Evaluation of Text-To-Image Models

Authors:Tony Lee, Michihiro Yasunaga, Chenlin Meng, Yifan Mai, Joon Sung Park, Agrim Gupta, Yunzhi Zhang, Deepak Narayanan, Hannah Benita Teufel, Marco Bellagente, Minguk Kang, Taesung Park, Jure Leskovec, Jun-Yan Zhu, Li Fei-Fei, Jiajun Wu, Stefano Ermon, Percy Liang

View PDF

Abstract:The stunning qualitative improvement of recent text-to-image models has led to their widespread attention and adoption. However, we lack a comprehensive quantitative understanding of their capabilities and risks. To fill this gap, we introduce a new benchmark, Holistic Evaluation of Text-to-Image Models (HEIM). Whereas previous evaluations focus mostly on text-image alignment and image quality, we identify 12 aspects, including text-image alignment, image quality, aesthetics, originality, reasoning, knowledge, bias, toxicity, fairness, robustness, multilinguality, and efficiency. We curate 62 scenarios encompassing these aspects and evaluate 26 state-of-the-art text-to-image models on this benchmark. Our results reveal that no single model excels in all aspects, with different models demonstrating different strengths. We release the generated images and human evaluation results for full transparency at this https URL and the code at this https URL, which is integrated with the HELM codebase.

Comments:	NeurIPS 2023. First three authors contributed equally
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2311.04287 [cs.CV]
	(or arXiv:2311.04287v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2311.04287

Submission history

From: Michihiro Yasunaga [view email]
[v1] Tue, 7 Nov 2023 19:00:56 UTC (10,710 KB)

Computer Science > Computer Vision and Pattern Recognition

Title:Holistic Evaluation of Text-To-Image Models

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computer Vision and Pattern Recognition

Title:Holistic Evaluation of Text-To-Image Models

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators