What Happens When AI Gets Too Good at One Thing
Thoughts on Claude Mythos
Hey there, I’m Alberto!👋
Each week, I publish long-form AI analysis covering culture, philosophy, and business for The Algorithmic Bridge. Paid subscribers also get Monday how-to guides and Friday news commentary. I publish occasional extra articles and essays. If you’d like to become a paid subscriber, here’s a button for that:
It seems that, seven years and a lot of headaches later, the AI industry has come full circle.
In 2019, OpenAI withheld the newly announced GPT-2 from public release. They appealed to the presumed dangers of a system whose full intelligence and capabilities were yet unknown. To this day, I’m unsure if that was merely a marketing gimmick—a successful one at that—or if they were genuinely worried. I’m inclined toward the latter; after all, the GPT paradigm was novel, and it had already showcased unexpected signs of proto-AGI: it could talk!
Fast forward to 2026, and many of those same OpenAI researchers are now at Anthropic, which has announced that it’ll hold back Claude Mythos (preview) from public release (system card here).
History rhymes, but it doesn’t repeat.
The difference is that whereas GPT-2 marked a display of caution born out of ignorance—“it writes well; could be used for disinformation or propaganda or whatnot, we just don’t know” (OpenAI released it six months later)—Anthropic’s messaging suggests the opposite is true for Mythos: it’s not so much that the company doesn’t know what it is capable of but that it knows exactly what it can do.
If you are choosing to leave a ton of money on the table by not releasing your latest and most powerful model to consumers and enterprise clients, you must have a good reason to do so. For those who, like me, don’t know much about software security, Mythos is like a “nuke for code”—if nukes could also be used for good.
Given a piece of software—a browser, an operating system, whatever—Mythos, which by the way destroyed every coding and agentic benchmark out there, can find old, immediately exploitable vulnerabilities that not even the developers knew existed, with minimal human steering. And then it can go and figure out how to exploit them. From this moment on, no software system in the world is safe, to the extent that this technology exists (in practical terms, only 3-5 players will be able to build this kind of system in the next few months, and none outside the US).
Anthropic’s post reads:
AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities. . . . Mythos Preview has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser. Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who committed to deploying them safely. The fallout—for the economy, public safety, and national security—could be severe.
That’s the nuke part.
The “reverse-nuke” part is that the same capability works the other way around: if you can find the hole, you can patch it.
To that end, Anthropic has launched Project Glasswing, a coalition that includes Amazon, Apple, Google, Microsoft, NVIDIA, and many other partners concerned with maintaining critical software infrastructure and combating cybercrime. They all have access to Mythos with one goal: find and fix the vulnerabilities before someone else’s (China’s) model finds them first.
It goes without saying that you won’t get access to Mythos for the foreseeable future, but now you understand why.
You may still object, though: “They are centralizing AI! They want to keep the powerful models for themselves! I’m not a malicious actor, why can’t I get access?” Yes, you are correct. But did you really expect them to sell near-AGI models as a monthly subscription product? Mythos is not a Substack newsletter.
The reason I compared it to nukes is to drive this point home: do you think it’d be preferable if anyone could buy the materials and the manuals to build a homemade DIY miniature nuclear weapon that could obliterate a small city? Probably not. You should think of narrow superintelligences like Mythos—superhuman at one single thing—as AI’s equivalent of nukes.
Philosophical dilemmas in the real world all have this knack for shutting down simple solutions. What would you choose, to sacrifice freedom in the name of safety or to sacrifice safety in the name of freedom? Sometimes it’s not so easy to decide.
In any case, because Anthropic is a private company, “you don’t get to weigh in on this,” as an OpenAI executive told employees regarding the use of their models by the US government.
(I’m not positioning myself for or against it, just saying things as they are.)
Anyway, because we don’t know much else about Mythos (besides the system card, the red teaming docs, and the Project Glasswing announcement), I want to dedicate the rest of this post to four questions that I touched on in passing above and that I think people will want answered (meaning, that I myself want answered):
1. Isn’t this just Anthropic doing marketing?
2. Why did they make the decision public?
3. Why won’t they give good people access?
4. Is Mythos a narrow superintelligence?
Let’s start from the first one and work our way down the list.