OpenAI o1: 10 Implications For the Future
Just in time for a new wave of model releases: GPT-4.5, Opus 3.5, Gemini 2.0
You may remember the 7,000-word monster post I published in September about OpenAI’s new model family, the o1 series (o1-preview and o1-mini). I want to highlight two things about it.
First, o1 is the most primitive version of a new kind of AI. It makes no sense to judge a new paradigm by the value of its seed. Let it sprout. Let it blossom.
Second, the last section of that article was paywalled. It was “10 implications for the long-term future.” (Note the intentional “long-term” in there!) I believe that section deserves to be open to the public, so I’m de-paywalling it below.1
To give you the necessary context—so you don’t have to re-read a 25-page diatribe—those ten implications are, loosely, the answers to the questions I raised in the introduction of that post:
What does a reasoning AI mean for generative AI (is it generative at all)?
How will users relate to and interact with AI models that can think?
What can reasoning models do when you allow them to think for hours, days, or even weeks?
How will reasoning models’ performance scale with compute from now on?
How will companies allocate compute across the training-test pipeline?
What does all of this mean for the end goals of AI?
How does this relate to GPT-5 (if at all)?
So here you go, 10 implications for the long-term future of the OpenAI o1 series, the new test-time scaling laws, and the new paradigm of AI that can reason (with a few edits for clarity). To those of you who’ve read it, let it serve as a reminder of what’s ahead.
What is happening is larger—much larger—than the models OpenAI has released. It goes beyond the scaling laws, beyond the new paradigm, and beyond generative AI. This section is both the summary of today and the story of tomorrow.
Generative AI as the leading AI paradigm is over. This is perhaps the most important implication for most users who don’t care about AI beyond what they can use. Generative AI is about creating new data. Reasoning AI is unconcerned with that; it is intended to solve hard problems, not generate slop. Furthermore, OpenAI is hiding the reasoning tokens, and the others will as well. You can still use generative AI tools to generate stuff, but the best AIs won’t be generation-focused; they’ll be reasoning-focused. It’s a different thing whose applications we don’t yet understand well.
The era of the chatbot is also over. People will still use chatbots to chat casually, but they won’t be the best AIs anymore, and AI companies’ efforts will be spent elsewhere. OpenAI, DeepMind, and Anthropic will explore and scale the reasoning route instead, relegating the chatbot route to a secondary priority. As long as the current chatbot services keep providing a recurring source of income, they won’t allocate many resources to improving them beyond what the market demands.
Humans will increasingly feel less and less equal to AI (eventually inferior) across tasks for which we’ve always considered ourselves uniquely suited. The idea of the user having a relationship of equals with AI (e.g. for work) now belongs to the past. Instead, we will witness, in awe or terror, how AIs go deeper and broader than we can follow. In the past, we were the architects and constructors of the world, but then machinery happened, and we stopped being the constructors. We’re about to stop being the architects as well.2
The old scaling laws (all compute dedicated to pre-training and post-training) have been replaced by new scaling laws (compute distributed between training and inference so that models can reason in real time). What happens from now on can’t be predicted with the heuristics we’ve been applying for the last four years. Time to update (if OpenAI generously decides to give us more details on how). I sketch a toy version of what the new laws might look like right after this list.
OpenAI has finally merged the two most important paradigms from the last 20 years of AI research: Large language models (OpenAI’s GPTs) and deep reinforcement learning systems (DeepMind’s Alphas). That’s why they finally decided to move on from the GPT name. In that sense, the o1 series restarts the company’s research path after five long years.
The emergence of AI-rich and AI-poor classes of users. AI models will become expensive faster than companies can find optimizations to reduce costs. Soon enough, new pricing tiers (already hinted at) will appear that only a privileged few will be able to afford, granting them the benefits of cutting-edge AI models. The rest will belong to the AI-poor underclass.
Everything we think we know about AI just changed. Roon, the creative, esoteric, spiritual anon researcher at OpenAI, says this: “we will have to rewrite all the common public debate dialogue trees about plateaus and scaling and whatnot. about whether error in autoregressive models is compounding or self-recovering. whether language can lead to true intelligence. where the line for general intelligence is.” I agree.
There’s no clear value proposition for users of this new kind of AI. Most people won’t know what to do with an AI that can reason. That’s the hard truth, and it reveals more about humans than about AI. It’s not trivial to figure out where to apply an AI that can think for an entire day about a single problem.
This is not a ChatGPT moment. It’s not accessible, but it is profound, and most people will fail to realize how much. ChatGPT’s main trait was its accessibility (free, easy to use, intuitive). GPT-3’s main trait was its surprising capabilities, but you had to work hard to find them. o1 is more like GPT-3, so most users won’t understand its value (especially given the wait times per query and the current messaging limits).
How does this relate to GPT-5? The new foundational model is still coming. o1 is based on GPT-4o, which means it has a lot of room for improvement if they change the underlying LLM. GPT-5 will make use of this new paradigm but again, only a narrow audience will appreciate the value. However, you should still try to make the jump now. o1 is for science and math and that may not be all that useful for you. But the end goal is JARVIS, then AGI, then—who knows. You may be tempted to think it’s not worth it to switch now but that’s a shortsighted choice: OpenAI and the others will continue developing the new paradigm because that’s what has a chance of fulfilling the field’s ultimate purpose.
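Since the new scaling laws (implication four above) are easy to state but hard to picture, here’s a toy formulation. This is my sketch, not OpenAI’s published math: the log-linear form, the constants $a$ and $b$, and the budget constraint are all illustrative assumptions meant to capture the qualitative shape of the o1 compute charts. The old laws poured the whole budget into training; the new ones split it between training compute and per-query thinking compute:

$$\text{score}(C_{\text{train}}, C_{\text{test}}) \approx a \log C_{\text{train}} + b \log C_{\text{test}}, \qquad \text{subject to} \quad C_{\text{train}} + N \cdot C_{\text{test}} \leq C_{\text{total}}$$

where $N$ is the number of queries the model will serve over its lifetime. The exact numbers don’t matter; the lesson does: once $b > 0$, pouring every FLOP into pre-training stops being optimal, and “how long should the model think about this problem?” becomes a dial you can turn per query.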
One reason why I find it worthwhile to share this section with you now, a month after OpenAI announced o1, is because I’ve heard rumors of imminent releases: OpenAI GPT-4.5, Anthropic Opus 3.5, and Gemini 2.0 are what people talk about, although I can confirm nothing. Anyway, whether this week, next week, or in two months, this new wave of models is the prelude to the GPT-5 era, which will be suffused by the implications I enumerated.
In no time, as I predicted in my article “GPT-4: The Bitterer Lesson,” we won’t be able to remain sense-makers (I also touched on this in a more optimistic piece, entitled “Human → Superhuman → Ultrahuman”). The world was always too complex to fathom in its entirety but now its incommensurability will unfold before our eyes, in the words of minds beyond our own. Dario Amodei, Anthropic CEO, seems to agree.
I enjoyed that article, and after having a month to play with it, I have a few things to reflect on. As far as LLMs being a worthwhile technology goes, it has certainly extended my abilities and has collated information better and faster than I could have myself, had I had the interest to pull it together. For example, I took several scientific papers regarding aging, chronic diseases, and the associated transcriptomes and proteomes. Then I cross-referenced them against the associated disease-state mechanisms. Then I had it search for nutraceuticals and pharmaceuticals that might help mitigate the effects. Finally, I had it create a .csv file, which I then massaged a bit further. 4o did most of the work, though interestingly there was a point where it gave comparative output against o1 and asked which I preferred. The o1 output was more complete, but at an additional 80 seconds of processing, was the cost worth the marginal improvement? Possibly, depending on the importance of the project.

Is AI faster than me, more thorough than me? Yes, particularly when parsing information outside of my native senses. Can it help me do something that would be totally outside of my abilities? As an example and a proof of concept, I have tried to write a novel. I am not a great writer, especially not of fiction. But AI created a totally engaging prose style that would be impossible for me. In creating a plot and fitting themes for the genre, AI gave me some ideas. I asked it about their originality and it told me. I merely blended a few together. I did the same for setting, character, conflict, and the other elements of fiction. My wife seemed pleased with the output, especially in voice mode. (Not that we are a critical audience.) But it was a fun and enlightening experiment.

So no matter what the naysayers of AI think or how they want to move the goalposts, this really is a generational technology that can change industry. Do I think it can replace me in its current iteration? Probably not during my career. Then again, I think the scaling law may have been maxed out; however, it seems the AI scientists have yet to run out of strategies to improve the technology.
Thank you for this incredible article, Alberto. So, we will live among aliens that 'we' created ourselves. Or, rather, folks who had nothing to do with its creation must accept what a small portion of humanity has built for everyone else. I love technology, but I have concerns about this technology existing in the hands of humans, who have too often been proud, aggressive, and moved by greed. You wrote, "OpenAI and the others will continue developing the new paradigm because that's what has a chance of fulfilling the field's ultimate purpose." I wondered what this meant exactly. Thanks again for your expertise and sober remarks about the fascinating times (for better or worse) we are living in.