The computing scale of Grok over DeepSeek does not reflect the quality measured in the benchmarks. The prowess of Chinese engineering is superb, at a scale of 10x, 100x, or even 1000x over the US companies. I believe Elon has lost the race. Too little, too late.
I hate that the "DeepSeek trained its model with $5 million" misinformation went so far that readers of this newsletter are still confused. That's terrible damage to many people's knowledge and to the epistemic hygiene of AI-adjacent circles. I debunked it twice during that crazy week, but I guess it wasn't enough. You can go read the articles to understand why DeepSeek, although more efficient than the US models (because they had to be, as I explain), is nowhere near 100x or 1000x more efficient.
Thanks for your response. Now, let's suppose DeepSeek has 10,000 H800s at their disposal and Grok has 100,000 H100s. In GPU count alone, Grok should have at least 10x more power and results. Now, the H100 has at least 5x the performance of the H800. That means 50x more compute power. And the training hours for DeepSeek are in the thousands, compared to the millions for Grok. That makes a difference of at least 100x to 1000x more power, and the difference in your benchmarks is a mere 10%? Again, the engineering prowess of the DeepSeek team is awesome. Elon should bid $100 billion for the DeepSeek team, not for OpenAI.
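To make the arithmetic in the comment above concrete, here is a minimal sketch using the commenter's own figures; the GPU counts and the 5x per-GPU advantage are the commenter's assumptions, not verified numbers.

```python
# Back-of-the-envelope compute comparison using the commenter's assumed
# figures (GPU counts and the 5x per-GPU advantage are assumptions from
# the comment above, not verified specs).

deepseek_gpus = 10_000      # assumed H800 count
grok_gpus = 100_000         # assumed H100 count
per_gpu_speedup = 5         # assumed H100 advantage over the H800

gpu_ratio = grok_gpus / deepseek_gpus          # 10x in GPU count
compute_ratio = gpu_ratio * per_gpu_speedup    # 50x under these assumptions

print(f"GPU count ratio:   {gpu_ratio:.0f}x")
print(f"Raw compute ratio: {compute_ratio:.0f}x")
```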
I understand your point better now. But you're making a set of assumptions that are not true. Performance doesn't scale linearly with compute. It scales, but labs wish it scaled that well. Besides, in my article I say that xAI probably didn't do as much optimization as DeepSeek because they didn't have to. That doesn't mean they can't or don't know how; DeepSeek published their methods and results! What this means is that xAI will probably get a much larger upside once they introduce the algorithmic tricks DeepSeek did. On the other hand, you generalized to "US companies," and although that's possibly true for xAI, it isn't for OpenAI, Anthropic, or Google (whose latest models are even cheaper than DeepSeek's). What I want to say with this is that the state of the art is *international*, not a matter of US vs China.
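For intuition on that non-linearity, here is a toy sketch assuming a power-law relation between loss and compute; the exponent is an illustrative value in the rough ballpark of published scaling-law fits, not a measurement of Grok, DeepSeek, or any other specific model.

```python
# Minimal sketch of why benchmark quality doesn't track raw compute linearly.
# Scaling-law fits put loss roughly on a power law, L(C) ~ C**(-alpha), with
# a small exponent; alpha = 0.05 below is illustrative only.

def relative_loss(compute_ratio: float, alpha: float = 0.05) -> float:
    """Loss of the larger run relative to the smaller one under L ~ C^-alpha."""
    return compute_ratio ** (-alpha)

for ratio in (10, 100, 1000):
    print(f"{ratio:>4}x compute -> loss drops to ~{relative_loss(ratio):.2f} of baseline")

# Even 1000x more compute only cuts loss by roughly 30% under this toy fit,
# which is why benchmark gaps end up far smaller than compute gaps.
```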
And too expensive!