Credible without Credit: Domain Experts Assess Generative Language Models

Abstract

Language models have recently broken into the public consciousness with the release of the wildly popular ChatGPT. Commentators have argued that language models could replace search engines, make college essays obsolete, or even write academic research papers. All of these tasks rely on accuracy of specialized information which can be difficult to assess for non-experts. Using 10 domain experts across science and culture, we provide an initial assessment of the coherence, conciseness, accuracy, and sourcing of two language models across 100 expert-written questions. While we find the results are consistently cohesive and concise, we find that they are mixed in their accuracy. These results raise questions of the role language models should play in general-purpose and expert knowledge seeking.

Anthology ID: 2023.acl-short.37
Volume: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 427–438
URL: https://aclanthology.org/2023.acl-short.37
DOI: 10.18653/v1/2023.acl-short.37
Cite (ACL): Denis Peskoff and Brandon Stewart. 2023. Credible without Credit: Domain Experts Assess Generative Language Models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 427–438, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Credible without Credit: Domain Experts Assess Generative Language Models (Peskoff & Stewart, ACL 2023)
PDF: https://aclanthology.org/2023.acl-short.37.pdf
Video: https://aclanthology.org/2023.acl-short.37.mp4
