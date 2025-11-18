The Algorithmic Bridge

Sol
3h

As the ScreenSpot-Pro benchmark shows, Gemini seems to be better than the other LLMs at image understanding. I think this is why it’s good at ARC. Even though you feed the LLMs ARC1 in text, visualisation helps: when I see the puzzles as images they become a lot easier, so maybe giving Gemini a better ‘visual brain’ helps it do better too.

As far as I know, nano-banana is the best AI image generator/editor, and Google is doing interesting video -> playable-world stuff. My theory is that they have figured out a way to get better knowledge transfer between these models, rather than just bolting image models and LLMs together.

One other question that stands out: how is Anthropic still so good at coding? Even with all these advances, Anthropic stays ahead in coding benchmarks and often seems to be the preferred choice for developers.

Donato Mangialardo
2h

Too much tech jargon...

So I asked G3:

Here are the three massive shifts happening right now:

1. Coding: From "Ingredients" to "The Meal" 👨‍💻

The Old Way (Gemini 1.5): You ask for a Pomodoro timer. The AI spits out 50 lines of Python code in a grey box. You have to copy it, paste it into an editor, run it, and debug the errors yourself.

The Gemini 3 Way: You ask for the timer. The AI builds and runs the app instantly right in the chat window. You click the buttons, use it immediately, and can say "make it blue" to update it in real-time. No copy-pasting required.

2. Planning: From "Walls of Text" to "Visual Dashboards" 🗺️

The Old Way: You ask to plan a 3-day trip to Rome. The AI writes a long, dry bulleted list of text. You have to read through it and manually Google the locations to see if they look good.

The Gemini 3 Way: The AI treats your screen like a canvas. It generates a visual magazine layout with an interactive map, photos of hotels, and a clickable schedule. You can tap a hotel to swap it out without ever leaving the chat.

3. Logic: From "Fast Answers" to "Deep Reasoning" 🧠

The Old Way: You ask a tricky riddle or a complex data question. The AI rushes to answer in 2 seconds, prioritizing speed over accuracy. It sounds confident, but it often misses the nuance or hallucinates.

The Gemini 3 Way: The AI pauses. It actually "thinks" first (simulating a chain of thought). It checks its own work—"Wait, that calculation looks wrong, let me convert the currency first"—and then delivers the solution.

The Takeaway: The friction of "translating" AI answers into real work is disappearing. We aren't just searching for information anymore; we are generating working solutions.

It can do much better than that

