GPT-4: A Viral Case of AI Misinformation
For those of you wondering, GPT-4 won't have 100 trillion parameters
I'm responsible for the “GPT-4 will have 100 trillion parameters” false statement going viral on social media.
In case you don’t know what I’m talking about, here are a couple of visual examples:
The two images above were shared on Twitter (together, they’ve amassed 5 million views). Similar posts are circulating on LinkedIn, Reddit, and other sites, all slightly different versions of the same thing.
They combine an appealing visual graph comparing GPT-3 and GPT-4 (recent ones use GPT-3 as a proxy for ChatGPT) with an accompanying hook carrying a high emotional charge (e.g. “this is a frightening visual for me” or “[GPT-4] will make ChatGPT look like a toy”). There’s also an authoritative touch to them—more often than not, the posters show no trace of doubt in their claims and cite no sources.
The millions of people who first learn about GPT-4 through these posts leave irremediably hyped (and surprised or afraid) about what’s to come. The problem is twofold: they can’t see past the hype because fake knowledge carries an illusory certainty, and, sooner or later, they’ll find out their expectations won’t be fulfilled (i.e. GPT-4 won’t be as amazing as those visuals suggest).
Taken to the extreme, this problem is the perfect recipe for an AI winter. As I’ve argued recently, I don’t think that’s going to happen, but the risk increases with this kind of viral misinformation.
This (personal) essay is a cautionary tale about how fast—and how deep—misinformation spreads through the internet and how it can emerge even from the best intentions. It’s also an apology on my part and a self-reflection on how publicly shared knowledge can have an unforeseen (negative) impact downstream.
Here’s why I feel responsible (at least in part).
How misinformation emerges from information
“A lie can travel halfway around the world while the truth is putting on its shoes.”
- Commonly attributed to Mark Twain
I don’t remember when I came across the first GPT-4 visual, but it was well before ChatGPT was released. ChatGPT’s virality has contributed greatly to the spread of the fake GPT-4 prediction—it prepared the breeding ground for it. People love talking about hot topics (AI has never been hotter than it is now), and the people who post these (obviously) attractive graphs have a strong incentive to do so, given the potential gains in reach and engagement.
But I do recall very well the first public source that mentioned the 100T figure: it was this article by Will Knight in Wired, published in August 2021. I remember it vividly because as soon as I read it I knew it was a banger—it led me to the piece that makes me responsible: “GPT-4 Will Have 100 Trillion Parameters—500x the Size of GPT-3,” which I published in September 2021. (Don’t go read it; it’s outdated!)
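For scale, here’s a quick back-of-the-envelope check of that headline’s arithmetic, assuming GPT-3’s publicly reported size of 175 billion parameters:

$$\frac{100 \text{ trillion}}{175 \text{ billion}} = \frac{100 \times 10^{12}}{175 \times 10^{9}} \approx 571$$

So the “500x” in the title was itself a rounded-down version of the implied ratio; if anything, the claim was even bolder than the headline suggested.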
In his article, Knight mentions a conversation that Andrew Feldman, CEO of AI hardware startup Cerebras Systems, had had with OpenAI. I quoted his claim:
Andrew Feldman, Cerebras’ CEO, said to Wired: “From talking to OpenAI, GPT-4 will be about 100 trillion parameters. […] That won’t be ready for several years.”
My article reached 100K views on Medium (mostly from Google search). That’s nothing compared to those LinkedIn posts with millions of views, but at the time it was the only source besides Wired. That makes me guilty, at least in part, for the very existence of the misleading GPT-4 images—other people eventually echoed Feldman’s bold claim, too. I just happened to do it first.
At the time it didn’t feel like a mistake but like a scoop (kind of). I published the article more than a year ago, when it was the newest info on GPT-4. Unfortunately, I hadn’t yet polished my principles on how to express hypothetical or speculative information. Feldman expressed the claim as a given, and although I couldn’t assess its veracity, I took it at face value and shared it with my audience.
Now I see the mistake.
In hindsight, a few details in Feldman’s claim should have set off my alarm. Not in the sense of proving it false, but in making me take it with a grain of salt and frame it more—not less—conservatively (which I didn’t, helping spread the hype).
First, Feldman says “GPT-4 will be about [emphasis mine] 100 trillion parameters.” GPT-4 didn’t have a fixed number of parameters (i.e. GPT-4 wasn’t yet built). Despite this, I framed the news as “GPT-4 will have 100T params [emphasis mine],” which makes the statement firmer—and invites people to take it as settled truth.
But there’s a more telling detail. Feldman said: “That won’t be ready for several years [emphasis mine].” Although I quoted his claim verbatim, I also wrote afterward: “…[GPT-4] will come out in a few years… [emphasis mine].” Two things here. GPT-4 is rumored to be released in early or maybe mid-2023; no one would call 1.5 years “several” years. It’s possible Feldman wasn’t referring to the real GPT-4 but to a subsequent version (even if he explicitly said “GPT-4”), a possibility I dismissed. Also, from his “several years” to my “few years” there’s already a slight exaggeration—again, adding to the hype.
How misinformation spreads like wildfire
New information eventually came out. Soon after I published the infamous article, Sam Altman denied the 100T rumor in the AC10 online meetup Q&A, but he explicitly asked attendees not to reveal info from the talk (a request that, of course, wasn’t honored). Still, I waited until April 2022, and only then did I write another piece on GPT-4 (now at ~200K views) debunking the 100T figure I had helped spread: “One thing [Sam Altman] said for sure is that GPT-4 won’t have 100T parameters.”
Because of the new info, I added a disclaimer to my original GPT-4 article: “IMPORTANT: This article is outdated. Here’s the newest information.” It now links to a third article, published in November 2022 (yes, three in total): “GPT-4 Rumors From Silicon Valley,” which many of you have probably read. It explains everything—it not only covers the new info (which I framed as “rumors”) but also re-debunks the long-outdated “GPT-4 = 100T” claim.
Not much later, I began to see discussions about GPT-4 on social media. Some argued it was confirmed to be huge, whereas others said the number had been debunked and GPT-4 would be around GPT-3’s size. And then the questionable—even dishonest, given the open availability of the updates—visuals started to appear.
Less than a month later, ChatGPT came out. To my surprise, the GPT-3/GPT-4 visuals went more viral than ever before, precisely when it was clearest that the info was wrong, simply because the incentives to talk about the hottest topic (AI) were so strong. I was convinced people would follow the breadcrumbs to the original source, find my disclaimer, and go directly to the newest, updated info on the topic. But few people did (or maybe they did but decided the 100T claim was spicier).
We rarely care about details in the information we consume. I’m not implying people should have doubted the 100T number because it’s surprisingly high (only AI-savvy people would realize that). I’m saying we’re lazy in how we grow the database of prior knowledge from which we build our world models.
This kind of inaction helps spread misinformation as much as the people who inadvertently create it (like me, in this case). That’s why I’m now correcting anyone who repeats this claim and explaining why it’s wrong (others are doing the same).
Yet it doesn’t seem to be enough once misinformation has spread so far (that’s what motivated this article). What I wanted to be “breaking news” on GPT-4 ended up being a bad example of the telephone game.
A final personal reflection on my role in hyping up AI
Here’s a comment on the “GPT-4 Rumors From Silicon Valley” Medium post that made me think about my role in hyping up AI (the comment is longer and doesn’t refer to the 100T misinformation, but it raises a relevant point):
“What's best in your analyses is that you generally get around to debunking the hype. What's weakest (especially here) is that you spend a lot of time building up the hype before deflating it.”
Not that I fully agree with the implicit accusation, but it’s worth reflecting on. Here’s what I think.
First, the best approach I can take as a mix of AI journalist, analyst, and blogger is to keep my mind open to all sides of any debate about where AI is, where it’s going, or where it should go. The task of “debunking the hype” is more successful if the language I use reaches as many people as possible.
The commenter (who seems to usually agree with AI ethicists) criticized my observation that the news on GPT-4 was “…for some, frightening.” I can disagree with the reasons people are afraid of or excited about AI (as I often do), but I won’t deny that those feelings are real. AI ethicists (with whom I mostly agree) sometimes alienate those who think differently for this very reason. Even when they’re right (which isn’t always), they hardly reach beyond the people who already agree with them.
Second, talking positively about GPT-4 or ChatGPT is going to hype people up. But I don’t see anything wrong with resorting to attractive topics if I do it with integrity—and I’m convinced I do alright on that front (for the most part…). If I always wrote clickbaity pieces without any depth—as many people do, now that generative AI is so appealing—I would certainly dislike my content. But if I always turned to boring topics no one cares about, or criticized AI’s issues without ever acknowledging its brighter aspects, no one would want to read my essays.
Finally, I acknowledge my responsibility for the “GPT-4 will have 100T parameters” hype. Nowadays, I do my best to frame the information I share according to the trust I put in my sources and the likelihood of the information being true. Still, it’s inevitable that not everyone will like my reporting—and that’s fine with me.