Open Source AI Is Not Winning—Incumbents Are
Demystifying generative AI #2
Progress in open-source (OS) generative AI (particularly language models, LMs) has exploded in recent months. As a consequence—and with the help of desperate internal confessions—people believe it has become a threat to incumbent companies like Google and Microsoft, and leading labs like OpenAI and Anthropic.
I think I speak for the majority when I say that, leaving aside that open-sourcing AI systems indiscriminately could exacerbate risks, we like the idea. OS has a strong positive connotation; it’s not that people dislike Big Tech but that we like it when The People have the opportunity to outdo the powerful.
Here’s the story of how that feeling of anticipated victory came to be and why, sadly, it’s mistaken: OS AI is thriving but remains unable to dethrone the leadership of proprietary models—and unable to develop the resources to do so in the future.
Open-source AI is a force to be reckoned with
In early March, Meta’s LLaMA—the first decent OS LM—leaked online. OS developers accepted the gift and initiated attempts at reproducing OpenAI’s success in the open. In the next three months, the community lived through what Simon Willison called “a Stable Diffusion moment” for LMs—a wave of innovative research and creative development that promised to compete against ChatGPT.
Consulting firm SemiAnalysis shared a document in which a Google engineer argued that Google’s real competitor isn’t OpenAI but the OS community. Without a moat, neither of them could dream of resisting “an entire planet’s worth of … labor.” For instance, the community managed to solve the scalability problem—it’s now trivial to fine-tune a pre-trained model quickly and cheaply with a technique called LoRA. That’s “a big deal” and “underexploited inside Google.” The way to neutralize this threat, he concludes, is to embrace OS.
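To see why LoRA makes fine-tuning so cheap, here is a minimal, from-scratch sketch of the idea (not any library’s actual implementation): the pre-trained weight matrix stays frozen, and only a low-rank update—two small matrices A and B—is trained. The sizes below are illustrative assumptions, not figures from any real model.

```python
import numpy as np

# Illustrative, assumed sizes: one d x d attention weight matrix.
d, r = 4096, 8          # model width, LoRA rank (r << d)
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))          # frozen pre-trained weight
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection (init to 0)

# LoRA forward pass: the frozen path plus a low-rank correction.
x = rng.standard_normal(d)
y = W @ x + B @ (A @ x)                  # equals W @ x at init, since B = 0

full_params = d * d                      # what full fine-tuning would train
lora_params = A.size + B.size            # what LoRA trains instead
print(f"trainable params: {lora_params:,} vs {full_params:,} "
      f"({100 * lora_params / full_params:.2f}%)")
```

With these numbers, LoRA trains roughly 0.4% of the parameters that full fine-tuning would touch—that is the entire trick behind the fast, cheap customization the memo highlights.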
The article is an illuminating read that convinced many that OS is indeed winning the race. But I believe the conclusion doesn’t follow from the premises. Below, I lay out three reasons why that’s the case—why OS AI, despite being a force to be reckoned with, is far from being a threat to incumbents or wealthy startups.
The false promise of imitating proprietary LMs
First reason: a problem with the algorithms.
If we take the best OS LMs (those intended to compete against ChatGPT), we encounter a clever technique first proposed in the Self-Instruct paper: improving models by fine-tuning on their own output (or on that of better models). Two of the best-known OS LMs—Alpaca and Vicuna—used self-instruct with GPT-3.5 and ChatGPT, respectively.
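The imitation recipe is simple enough to sketch in a few lines. The snippet below is a toy illustration of collecting the fine-tuning data, where `query_teacher` is a hypothetical stub standing in for a call to a stronger proprietary model (GPT-3.5 in Alpaca’s case); a real pipeline would hit the teacher’s API instead.

```python
# Toy sketch of Self-Instruct-style data collection for imitation fine-tuning.

def query_teacher(instruction: str) -> str:
    # Placeholder: a real pipeline would call the proprietary teacher model here.
    return f"[teacher answer to: {instruction}]"

seed_instructions = [
    "Explain what a language model is.",
    "Summarize the plot of Hamlet in one sentence.",
]

# Build (instruction, response) pairs; the open-source student LM is then
# fine-tuned on these pairs, learning to imitate the teacher's outputs.
dataset = [
    {"instruction": inst, "response": query_teacher(inst)}
    for inst in seed_instructions
]

for example in dataset:
    print(example["instruction"], "->", example["response"][:40])
```

Note what the student learns from: the teacher’s outputs, not its underlying knowledge—which is exactly the limitation the Berkeley study documents.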
But a new study from UC Berkeley analyzed this method and found that “there exists a substantial capabilities gap between open and closed LMs.” The best chance of bridging that gap is with “more capable base LMs,” not imitation: models fine-tuned this way adopt the style of the model they imitate but don’t improve in “factuality.” Alpaca and Vicuna can’t really compete with GPT-3.5 and ChatGPT.
This is a hard blow to the OS AI community. To entice users with cheap, customizable chatbots that are sufficiently capable, they’ll have to take on the daunting challenge of pre-training a state-of-the-art LM. I don’t see this happening any time soon; the leak of Meta’s model is the only reason any of this happened at all—the OS community did what it could with the scraps of an incumbent.
The limits of on-device inference for LMs
Second reason: a problem with the hardware.
Building small and cheap customized LMs makes sense if you want to run them locally and privately on-device (on a computer or smartphone); that’s OS’ appeal to serious users. But there’s a limit to how good those can get.
Dylan Patel wrote a great article for SemiAnalysis on the memory wall and data reuse problems for on-device LM inference. Generating a token (e.g., when ChatGPT outputs one word) is a computationally costly process. Server-hosted LMs can be parallelized across queries to minimize the expense. If you’re the only user of your chatbot—and presumably you are—you can’t leverage this shortcut. The costs soon become unaffordable for high-quality LMs.
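A back-of-the-envelope model makes the problem concrete. In the decoding phase, every token generated requires streaming all model weights from memory once; a server batching many users’ queries shares that cost, while a solo on-device user pays it in full. The numbers below (a 70B-parameter model in fp16, 2 TB/s of memory bandwidth) are rough, assumed figures chosen only to show the shape of the trade-off.

```python
# Back-of-the-envelope cost model for the memory wall in LM inference.
# Assumed, illustrative numbers -- not measurements of any real system.

WEIGHT_BYTES = 70e9 * 2          # 70B parameters, 2 bytes each (fp16)
MEM_BANDWIDTH = 2e12             # 2 TB/s, roughly a datacenter accelerator

def seconds_per_token(batch_size: int) -> float:
    # Each decoding step streams all weights once, regardless of how many
    # queries are in the batch -- so batching amortizes the memory cost.
    return (WEIGHT_BYTES / MEM_BANDWIDTH) / batch_size

solo = seconds_per_token(1)      # a single on-device user
served = seconds_per_token(64)   # a server batching 64 users' queries

print(f"per-token cost, batch=1:  {solo * 1000:.1f} ms")
print(f"per-token cost, batch=64: {served * 1000:.2f} ms")
```

Under these assumptions the solo user pays 70 ms per token where the server pays about 1 ms per user—a gap no algorithmic cleverness on the open-source side can close, because it comes from the hardware.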
Due to hardware limitations and trade-offs that won’t be solved any time soon, OS AI’s best asset—private, cheap, custom models—has a hard upper limit. The LMs they can offer can’t match the performance of proprietary, server-hosted ones (like Google’s or OpenAI’s), however great the improvements they invent on the algorithmic side.
The moat of incumbents in action
There’s a final reason on the business side of things: If winning is a matter of reaching the largest number of users, incumbents don’t have competition at all.
The leaked document says Google—and by extension, Microsoft—has no moat. Well, that’s deeply wrong. They have moats. Not just money. Not just talent. Not just resources, influence, and power. All that too, but their true moat is that they design, build, manufacture, and sell the products we use.
And they are executing a ruthless strategy to make the most of that advantage. Microsoft has turbocharged Bing and Edge, 365, and now Windows. Google has enhanced Search and Workspace (including Gmail and Docs). Incumbents in other areas are doing the same. Adobe Firefly, now powering Generative Fill in Photoshop, is a clear example on the image generation front. On the hardware side, Nvidia—the undisputed leader—has optimized its top-notch H100 for LM inference.
The innovator’s dilemma portrays incumbents as beatable: Challengers with a solid will to pursue risky innovation could, under the right circumstances, overthrow them. But let’s be frank here; we’re not living under those ideal conditions: generative AI happens to fit perfectly with the suites of products that Google and Microsoft and Adobe and Nvidia already offer. They create the very substrate on which generative AI is implemented.
OpenAI and DeepMind never had a chance to succeed as challengers; the OS community even less so. They can’t offer products people would prefer over Google’s and Microsoft’s. This is even more decisive than I’m making it sound. Even if Google and Microsoft were to open-source their best AI and allow the OS community to flourish on top of freely shared innovation, they’d still keep the moat of all moats: those who create and sell the goods own the world.
Generative AI is but a new cog in the complex machine that governs our digital lives, and it’s slowly turning into an add-on to the incumbents’ hegemony.