ChatGPT Is A Blurry JPEG of The Web - The New Yorker
ChatGPT Is a Blurry JPEG of the Web
OpenAI’s chatbot offers paraphrases, whereas
Google offers quotes. Which do we prefer?
By Ted Chiang
February 9, 2023
In 2013, workers at a German construction company noticed
something odd about their Xerox photocopier: when they made
a copy of the floor plan of a house, the copy differed from the
original in a subtle but significant way. In the original floor plan,
each of the house’s three rooms was accompanied by a rectangle
specifying its area: the rooms were 14.13, 21.11, and 17.42 square
metres, respectively. However, in the photocopy, all three rooms
were labelled as being 14.13 square metres in size. The company
contacted the computer scientist David Kriesel to investigate this
seemingly inconceivable result. They needed a computer scientist
because a modern Xerox photocopier doesn’t use the physical
xerographic process popularized in the nineteen-sixties. Instead, it
scans the document digitally, and then prints the resulting image
file. Combine that with the fact that virtually every digital image
file is compressed to save space, and a solution to the mystery
begins to suggest itself.
Now, losing your Internet access isn’t quite so terrible; you’ve got all
the information on the Web stored on your server. The only catch
is that, because the text has been so highly compressed, you can’t
look for information by searching for an exact quote; you’ll never
get an exact match, because the words aren’t what’s being stored. To
solve this problem, you create an interface that accepts queries in
the form of questions and responds with answers that convey the
gist of what you have on your server.
What I’ve described sounds a lot like ChatGPT, or most any other
large language model. Think of ChatGPT as a blurry jpeg of all
the text on the Web. It retains much of the information on the
Web, in the same way that a jpeg retains much of the information
of a higher-resolution image, but, if you’re looking for an exact
sequence of bits, you won’t find it; all you will ever get is an
approximation. But, because the approximation is presented in the
form of grammatical text, which ChatGPT excels at creating, it’s
usually acceptable. You’re still looking at a blurry jpeg, but the
blurriness occurs in a way that doesn’t make the picture as a whole
look less sharp.
So let’s assume that we’re not talking about a new genre of writing
that’s analogous to Xerox art. Given that stipulation, can the text
generated by large language models be a useful starting point for
writers to build off when writing something original, whether it’s
fiction or nonfiction? Will letting a large language model handle
the boilerplate allow writers to focus their attention on the really
creative parts?
Obviously, no one can speak for all writers, but let me make the
argument that starting with a blurry copy of unoriginal work isn’t a
good way to create original work. If you’re a writer, you will write a
lot of unoriginal work before you write something original. And the
time and effort expended on that unoriginal work isn’t wasted; on
the contrary, I would suggest that it is precisely what enables you to
eventually create something original. The hours spent choosing the
right word and rearranging sentences to better follow one another
are what teach you how meaning is conveyed by prose. Having
students write essays isn’t merely a way to test their grasp of the
material; it gives them experience in articulating their thoughts. If
students never have to write essays that we have all read before,
they will never gain the skills needed to write something that we
have never read.
And it’s not the case that, once you have ceased to be a student, you
can safely use the template that a large language model provides.
The struggle to express your thoughts doesn’t disappear once you
graduate—it can take place every time you start drafting a new
piece. Sometimes it’s only in the process of writing that you
discover your original ideas. Some might say that the output of
large language models doesn’t look all that different from a human
writer’s first draft, but, again, I think this is a superficial
resemblance. Your first draft isn’t an unoriginal idea expressed
clearly; it’s an original idea expressed poorly, and it is accompanied
by your amorphous dissatisfaction, your awareness of the distance
between what it says and what you want it to say. That’s what
directs you during rewriting, and that’s one of the things lacking
when you start with text generated by an A.I.