Shawn Fumo

Yeah, I think we just don’t know the implications of 4.5 yet. It is interesting to note that the lab heads were wrong that just going up a size or so would fix hallucinations, etc. But the real question now is how much better it works for RL.

So I consider GPT-5 to be the actual test. That will probably tell us a lot about where everything stands in terms of scaling, especially how it compares to smaller models with more refined RL that may have come out by then.
