ChatGPT is five months old, i.e., ancient by AI standards. During this time, one of the most practiced AI sports has been trying to find the most succinct and precise description of what it is and what it does.
The original definition goes along the lines of: ChatGPT is a system trained to predict the next token given a history of previous ones, further tuned to follow human instructions. Andrew Kadel shared on Twitter a snarkier one his daughter came up with: ChatGPT is a “say something that sounds like an answer” machine. In the same vein, writer Neil Gaiman observed that “ChatGPT doesn't give you information. It gives you information-shaped sentences.” Then, at the more imaginative end of the spectrum, there's the claim that ChatGPT is a language model with emergent capabilities that allow it to understand and reason about the world.
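To make that first, more mechanical description concrete, here is a minimal sketch of next-token prediction. It uses GPT-2 through the Hugging Face transformers library purely as a stand-in, since ChatGPT's own model isn't publicly available; the prompt string is just an example.

```python
# A minimal sketch of "predict the next token given the previous ones."
# GPT-2 stands in for ChatGPT's model, which isn't publicly available.
# Requires the `torch` and `transformers` packages.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "ChatGPT is a system trained to"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits      # (1, sequence_length, vocab_size)
    next_token_logits = logits[0, -1]     # scores for the *next* token only
    probs = torch.softmax(next_token_logits, dim=-1)

# Print the five most probable continuations and their probabilities
top = torch.topk(probs, k=5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode([idx.item()])!r}  p={p.item():.3f}")
```

Generating text is just this step repeated: sample or pick a token, append it to the prompt, and predict again. The instruction tuning and RLHF layered on top shape which continuations get favored, but the core loop stays the same.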
But we’re all tired of this game. Mainly because it’s irrelevant: people are using ChatGPT as a Google replacement anyway. But also because ChatGPT’s functional complexity forces any particular description to illuminate only some aspects while leaving out others. The descriptions above are all partially right (though the “super-autocomplete” and the “reasoning engine” takes aren’t at the same level of correctness for me; the first is a bit conservative, while the second takes too many interpretive liberties), yet none of them is completely correct.
But before we stop doing this (let’s not tell anyone else how they should use ChatGPT; it doesn’t work), I will play the game one more time. There’s one description I came across just now that, in its partiality and incomplete correctness, becomes poetic. And it’s precisely this unintentional lyricism that I find worth sharing here.
Master of words, ‘messer’ of worlds
Shannon Vallor is a philosopher at the University of Edinburgh whose research is focused on “the philosophy and ethics of emerging science and technologies,” particularly AI. I first came across her writing in the philosophy publication Daily Nous, where a group of philosophers gathered in mid-2020 to take on the then-newly released GPT-3.
I vividly recall reading Vallor’s insights. They influenced my later perspectives on AI and language models. Here’s, in my opinion, the most illuminating excerpt from her essay “GPT-3 and the Missing Labor of Understanding”:
“Understanding is beyond GPT-3’s reach because understanding cannot occur in an isolated behavior, no matter how clever. Understanding is not an act but a labor. Labor is entirely irrelevant to a computational model that has no history or trajectory; a tool that endlessly simulates meaning anew from a pool of data untethered to its previous efforts. In contrast, understanding is a lifelong social labor. It’s a sustained project that we carry out daily, as we build, repair and strengthen the ever-shifting bonds of sense that anchor us to the others, things, times and places, that constitute a world.”
I love this framing and the way it emphasizes the social and cultural dimensions of human understanding. It departs from the typical “AI models can’t understand because they don’t have a world model” or “because they can’t access the meaning behind the form of the words.” Those are true, too, but this one (understanding as a labor we carry out actively and daily) was refreshing.
In line with these ideas (and getting back to today’s topic), she published a Twitter thread two days ago (April 9th) complaining that, during the live introduction at an event she attended recently, the host pulled up her biography from ChatGPT and it turned out to be completely made up.
I’ve seen this trend a lot lately: people prompt ChatGPT to generate a biography of themselves and it gets it wrong time and again. Wrong dates and places. Missing background. Invented research projects. Paraphrasing the people who have attempted this: “it all sounds scarily reasonable, but most of it is wrong.”
But Vallor noticed an interesting pattern beyond the plausible-sounding (and apparently random) mistakes. ChatGPT attributed to her a Ph.D. from UC Berkeley, a school she never attended. As she writes in her second tweet, her actual story is quite different, and the error doesn’t seem to be chance at all:
“I grew up in the Bay Area, and I’m a Professor holding a chair at a large, highly regarded research university. But my road there doesn’t ‘fit’ the mold, so ChatGPT changed my story to make it fit; to erase my difference.”
The beginning and end points of Vallor’s journey suggest a likely story (e.g., a Ph.D. from UC Berkeley in the middle), but the reality she lived was nothing of the sort. It “doesn’t track, statistically, with where I ended up,” she explains. “[T]his is most likely why ChatGPT rewrites my history, erases my hardest working period and places me where it thinks I ‘belong.’” ChatGPT, despite presumably having seen the actual details of Vallor’s biography in its training data, “decided” that the implausible, and admittedly harder, journey she lived through was too much of an outlier for its probability-driven machinery, and simply filled the gaps with untrue but statistically more reasonable events.
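As a toy illustration of that pull toward the statistically reasonable, here is a sketch that asks a plain language model (again GPT-2 as a stand-in) which of two biography sentences it finds more probable. Both sentences are invented for the example; neither is a claim about Vallor’s actual history.

```python
# A toy sketch of the pull toward the "statistically reasonable."
# GPT-2 stands in for ChatGPT; both sentences are invented for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def avg_log_prob(text: str) -> float:
    """Average per-token log-probability the model assigns to `text`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=input_ids the model returns the mean cross-entropy,
        # i.e. the negative average log-probability per token.
        loss = model(ids, labels=ids).loss
    return -loss.item()

typical = ("She grew up in the Bay Area and earned a Ph.D. "
           "in philosophy from UC Berkeley.")
atypical = ("She grew up in the Bay Area and took a long, winding road "
            "through unlikely jobs before landing a chair in philosophy.")

print(f"typical : {avg_log_prob(typical):.3f}")
print(f"atypical: {avg_log_prob(atypical):.3f}")
# The cliché trajectory will usually score higher, which is (roughly) the
# statistical pull that erases outlier stories when a model "fills in"
# a biography.
```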
It’s paradoxical: how is it that the only entity (for lack of a better word) that has “memorized” the whole internet is, at the same time, unable to faithfully retrieve the very details that make those hidden written jewels, the ones no one else will ever encounter, so precious? It’s from this perspective that I’ve found the description of ChatGPT I like the most. It’s not the most precise, but it is the most poetic: ChatGPT is a “messer” of worlds. An averager of stories. An eraser of the implausible.
The pervasiveness of the improbable
But the implausible does happen. And we don’t want it to be erased.
Vallor’s story is an outlier compared to stories that began and ended similarly. But the world is made of exactly that. It’s full of outlier stories. Every day, the improbable happens to you. Not all the time (by definition, what happens all the time is probable), but inevitably: given the quasi-infinite number of factors that go into making each instant what it is, the improbable, somewhere, somehow, happens.
ChatGPT doesn’t like that. It gets all the averages right, but when it comes to describing the weird, the strange, the unlikely, i.e., what truly defines each of us, what makes us who we are, it fails dramatically. ChatGPT would be 100% right, 100% of the time, if we were all the same person. The more distinct we are from each other, the more it fails to capture our character in all its essence.
And what are we without our idiosyncrasies? Copies of copies. I can’t think of a single person in my inner circles whose career even loosely resembles mine. My professional journey, although different from Vallor’s, is also implausible. An aerospace engineer who went on to work for an AI startup and then studied cognitive neuroscience, only to end up writing on the internet? ChatGPT would erase half of that past. I wouldn’t be who I am without it. And I’m sure this rings true for each of you reading this: your life is a compendium of similarly intricate and unrepeatable stories. How can we rely on AI for a faithful depiction of the real world under these conditions?
Fictional universes would suffer the same fate. That’s why ChatGPT's writing feels bland and dull: it can only write plausible, boring characters immersed in unsurprising stories. Every good writer knows that the only way to make imagined people feel real is to tap into the unlikeliness of their personalities and the events that surround them.
Essayists should be wary too. How can you expect to discover unwritten ideas or reimagine unsaid thoughts with ChatGPT when it constantly pulls you back onto the beaten track? I agree with Vox reporter Sigal Samuel that humans aren’t as original as we think we are, and that ChatGPT isn’t really taking originality from us. But it may well “homogenize our lives and flatten our reality.”
Thankfully, humans, albeit unoriginal, are extremely good at the labor of understanding that ChatGPT, as Vallor would say, is missing. We’re made for it. ChatGPT isn’t. Our history is full of implausible tales; let’s not let AI erase them.
This is one of your better posts. I don't fully agree with you on the "averaging" interpretation... that's a bit too simplistic, and it's the same error I made when first judging MidJourney. And yet, I can say it has challenges with anomalies and outliers. You repeat the misconception (*intentionally*, since you *know* this to be false) that GPT has "memorized" the internet. Two clarifications:
a) The training dataset comprises significantly less than a third of the internet. And it certainly (at this point) does not include video, which is a massive store of untapped information.
b) It isn't, as we now understand, memorization. It's fractal compression. It's pattern recognition. It's much, much more similar to the highly imperfect mechanism of human memory than it is to storing data in a database or on a hard drive with error correction and fault tolerance. From my understanding, GPT's method of "memory" is basically reconstructing context from patterns that were "burned in" to its neural net while digesting the training dataset and then reinforced with months of RLHF. So it's much more like reconstructive, symbolic human memory (stories grown from "idea seeds," abstract relations between disparate concepts, strange triggers like a smell expanding into massive sensory concepts like the day we met) than it is like literal bit-for-bit file storage.
Another great read, Alberto!
The way ChatGPT appears to fill in people's deviating life paths reminds me of how our own brains act in a similar way when it comes to how we perceive the world. There's the famous fact that our eyes have "blind spots" where they literally can't see, which the brain helpfully fills in with what it predicts should be there.
Then there's this relatively recent research showing that our brains tend to first spot the borders of objects and then fill in, or "color in," the surface area (https://www.sciencedaily.com/releases/2007/08/070820135833.htm).
This quote by one of the professors is telling: "...a lot of what you perceive is actually a construction in your brain of border information plus surface information—in other words, a lot of what you see is not accurate."
I just find it curious how a large language model that's said to mimic our reasoning process ends up inadvertently acting like our brains in yet another way.