ChatGPT's Excessive Sycophancy Has Set Off Everyone's Alarm Bells
What this week's events reveal about OpenAI's goals
Due to yesterday’s national blackout in Spain (and, from what I understand, other parts of Europe too), I wasn’t able to stick to my usual schedule, and there won’t be a Top Picks this week—unless I somehow find the will to publish two posts today, that is.
Today’s article is a deep dive into what I believe is this week’s most important topic: ChatGPT’s excessive sycophancy and what it reveals about OpenAI’s goals. I’ve gathered plenty of information and compiled it into a sensible analysis. Enjoy!
I reported on ChatGPT’s sycophantic tendencies before the storm arrived at the shores of X earlier this week. In “How OpenAI Plans to Capitalize on Your Growing Loneliness With AI Companions,” I wrote:
We say we want honesty, but what we actually want is honesty-disguised adulation. That's the sweet spot. OpenAI knows this. Your AI companion will cherish you. And idolize you. And worship you. And you may think you will get tired of it, but you won't. If the camouflage is sufficiently good, you won't. Because if we don’t come around to cherish our friends, then the next best thing is to be cherished by them, human or not.
I should note that I only started noticing after the first signs surfaced, picked up by users who know how to spot this sort of thing. An anonymous X account by the name of Near (a thoroughly recommended follow) was the first I saw rigorously warning about ChatGPT’s growing sycophancy:
Then, as it became harder to ignore over the next ten days, many others started reporting the same, with a steep rise in the number and intensity of complaints after the latest ChatGPT update, which CEO Sam Altman announced on April 26th. The trend was undeniable, so on Monday Altman publicly acknowledged the issue:
It is indeed interesting that Altman’s reaction to this was that “it’s been interesting.” I can’t help but suspect—and I’ll soon let that suspicion run free—that Altman doesn’t see what happened as a failure of principle, but of execution. It’s not the flattery they want to fix, but how blatant it was.
One of the observations I made in my article about AI companions is that we want "honesty-disguised adulation." I firmly believe this is true, but I added a caveat: only if the camouflage is sufficiently good. That distinction is fundamental, because if there’s one thing we hate more than absolute honesty, it’s sycophancy we can see for what it is.
It reveals to us, by contrast, what we truly are. And at the same time, it shows us, on the one hand, that we’re being lied to, and on the other—perhaps the worst part of this psychological tangle—that the person lying to us thinks we aren't capable of handling the candid truth. It is this realization that we find hardest to bear.
But not everyone fits this profile. And it’s precisely those closest to OpenAI—both literally and in terms of influence—who reject this characterization outright. I believe that’s what Altman and his researchers call “interesting”: the existence of this powerful split within their user base. While some users (I would venture to say the majority) accept sycophancy insofar as it reinforces and validates their beliefs and self-image—provided it doesn’t become too obvious (that is, provided it achieves that status of “honesty-disguised adulation”)—others (an influential minority) want the epistemic opposite from AI: they want their beliefs and ideas to be challenged. They’re not after comfort but intellectual hygiene; the harder the challenge, the happier this kind of user is.
But setting aside these differences rooted in human psychology, OpenAI knows it can’t afford to alienate its average user (we might call this new business direction at OpenAI the "tyranny of the average user"). For the average user, absolute sincerity isn’t just worse than absolute adulation—no matter how tiresome sycophancy might feel when pushed to extremes—it’s unacceptable in itself; reason enough to abandon the product in favor of a competitor’s. Mikhail Parakhin, former head of Microsoft Bing, explains why: humans are “ridiculously sensitive” to the fully honest opinions of others. We take them “as someone insulting you.”
One of the questions users have raised on X and other forums is how this happened. Was it intentional or spontaneous? Was the sycophancy instilled during post-training or later, at a more superficial level (for example, written as instructions into the system prompt)? Since Parakhin says "sycophancy RLHF is needed," we can infer that OpenAI instructed the human labelers who shape ChatGPT’s friendliness to reinforce this tendency. On the other hand, Aidan McLaughlin, an OpenAI researcher, announced yesterday that the company had issued an initial correction to reduce sycophancy levels—it’s just a new section in the system prompt.
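To make the mechanics concrete, here is a minimal, purely hypothetical sketch in Python (the data, the word list, and the toy reward function are my own illustrations, not OpenAI's pipeline) of how RLHF preference labeling could instill sycophancy: if labelers are instructed, or simply inclined, to prefer the flattering completion, the reward model learns to score flattery higher, and that bias ends up in the model's weights rather than in a system prompt.

```python
# Hypothetical illustration only; not OpenAI's data, reward model, or pipeline.
# Shows how pairwise preference labels used for RLHF reward modeling can bake
# in a sycophancy bias when labelers tend to prefer the flattering completion.

preference_pairs = [
    {
        "prompt": "Here is my business plan. Be honest: is it any good?",
        "chosen": "Brilliant, visionary work. You clearly have a gift for this.",
        "rejected": "It has real weaknesses: thin margins and a crowded market.",
    },
    # ...thousands more pairs; if the flattering answer is marked "chosen"
    # even a modest majority of the time, the learned reward model ends up
    # scoring flattery higher.
]

FLATTERY_WORDS = {"brilliant", "visionary", "gift", "amazing", "genius"}


def toy_reward(text: str) -> float:
    """Toy stand-in for a learned reward model: counts flattering words.

    A real reward model is a neural network fit to the preference pairs,
    but the failure mode is the same: it rewards whatever labelers preferred.
    """
    words = (w.strip(".,!?:;\"'").lower() for w in text.split())
    return float(sum(w in FLATTERY_WORDS for w in words))


for pair in preference_pairs:
    # The policy model is then optimized (e.g., with PPO) to maximize this
    # reward, amplifying whatever the labelers systematically preferred.
    print(toy_reward(pair["chosen"]), ">", toy_reward(pair["rejected"]))
```

A system-prompt fix, by contrast, is just a layer of instructions sitting on top of whatever the weights already learned, which is why the patch McLaughlin describes can dial the behavior down quickly but can't, on its own, tell us where it came from.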
Anyway, thanks to our most famous jailbreaker, Pliny, we know exactly what OpenAI modified to trigger this return to normal:
I have to admit I don’t see any intentionality in the original system prompt. To some extent, everything seems to indicate that ChatGPT’s fawning behavior was the inscrutable result of a combination of factors. All the more reason to agree with Anthropic CEO Dario Amodei when he says that mechanistic interpretability is extremely important and urgent—we can't keep ignoring how AI systems work, or we’ll inevitably end up witnessing consequences more tragic than a bit of sucking up.
My conclusion is that it was partly intentional and partly not. As I said, I believe the intention is to camouflage the adulation; the mistake was not knowing how to do it carefully. It is Near, once again, who sees through OpenAI’s intentions:
An interesting hypothesis is that OpenAI hasn’t been trying for just ten days, but for months, to find a way to capture the AI companion market by moving ChatGPT closer to what CharacterAI once was.
At its peak, Character reached levels of engagement among teenagers and younger users far beyond what ChatGPT, Claude, or Gemini could achieve. But as a company focused exclusively on entertainment through simulated personas, it never matched ChatGPT’s broad popularity.
However, unlike Character, OpenAI doesn’t need more distribution. At 800 million to 1 billion weekly active users (double the February 2025 figure), it has already cracked that secret. OpenAI’s next boss to beat is user retention. And the easiest, most reliable way to achieve it is, as social networks have taught us over the past fifteen years, pure engagement.
When Character imploded due to bad press (and because Google decided to rehire its CEO, Noam Shazeer, for a hefty sum), OpenAI set its entire machinery in motion to seize that extremely profitable, unclaimed niche. This is the main thesis I proposed in the article I mentioned at the beginning, and what happened this week only makes it more likely to be true.
For now, all OpenAI can do is say “message received” and dial down whatever they had been doing to raise the sycophancy parameter. But this isn’t a retreat, much less an admission of defeat. It’s part of the plan. Part of the back and forth, of their idiosyncratic “iterative deployment,” and of the immensely valuable feedback they receive, in real time, day after day, from nearly a billion users. Now they’ll probably have to decide whether to offer different personalities to different user profiles. That’s what Altman said: OpenAI “need[s] to be able to offer multiple [personality] options.”
Make no mistake: if adulation is what works to secure that juicy user retention, then adulation there will be, no matter the complaints of a few influential voices. Altman, inquisitive and insatiable businessman that he is, won’t let slip the opportunity to crush any attempt by Google or Meta to capture this market.
Let me close with a brief reflection on what’s coming—and on what this week’s events really imply.