ChatGPT Killed the Old AI. Now Everyone Is Rushing to Build a New One
AI faces three major hurdles right now. The second is the absence of substantial algorithmic improvements since the transformer arrived in 2017
In the first part of this series, I wrote about the global GPU shortage that’s forcing AI companies to slow down their short-term plans. Leaders like Google, OpenAI, and Meta are struggling to get the share of H100s they need to train their next-generation AI models in a reasonable timeframe.
As models grow in size, complexity (multimodality is in vogue), and compute requirements, chipmakers like Nvidia are trying to keep up with demand, but their manufacturing capacity grows slowly by comparison. GPT-5 may have to wait a bit longer than expected (Google DeepMind’s Gemini seems to be on schedule, likely to be announced after the summer).
The hardware shortage, although probably the most frustrating hurdle (AI companies can only wait, since they neither design nor build the hardware themselves), isn’t the only one: the other two fronts where companies can make progress—algorithms and data—have been somewhat drained too.
Today I will explore algorithms.
It would be a pessimistic, anti-hype exaggeration to say that we’re living through an algorithmic drought (the transformer keeps on giving), but the inherent limitations of this uber-successful paradigm, which gave birth to ChatGPT and GPT-4, remain unsolved despite being well-known.
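To make the best-known of those limitations concrete: at the heart of the transformer sits self-attention, whose cost grows quadratically with sequence length because every token attends to every other token. Here is a minimal NumPy sketch of scaled dot-product attention (illustrative only; the function name, shapes, and variables are my own, not any particular library’s API):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal single-head attention. Q, K, V: (seq_len, d) arrays.

    The scores matrix is (seq_len, seq_len): its memory and compute grow
    quadratically with sequence length -- the transformer's well-known
    bottleneck for long contexts.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # (seq_len, d)

# Doubling the sequence length quadruples the size of `scores`.
n, d = 1024, 64
Q = K = V = np.random.randn(n, d)
print(scaled_dot_product_attention(Q, K, V).shape)   # (1024, 64)
```

This quadratic scaling is precisely why longer context windows are so expensive in compute and memory, and why so much current research targets cheaper alternatives to full attention.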
Algorithmic constraints on top of the hardware shortage are forcing prominent researchers and companies like OpenAI and Google DeepMind to explore new ideas and dust off old ones in search of the next algorithmic breakthrough. Can small, iterative improvements to existing algorithms bring a new age of broad and profound progress, or should the field hope for the next “transformer moment”?