Google’s TPU Story Just Vindicated Everything I Warned About
A few months ago, I wrote 👇 that GPUs didn’t win AI because they were the right architecture.
They won because they were the only hardware available.
Convenient, not correct. Powerful, not purposeful. An empire built by accident, not design. Today, with Google’s TPU v7 (Ironwood) and the rollout of Gemini 3.0, that empire has begun to crack. And the deep-dive analysis of TPU v7 👇 doesn’t just challenge the GPU narrative; it dismantles it piece by piece.
https://www.uncoveralpha.com/p/the-chip-made-for-the-ai-inference
Here is the architectural truth the market is ignoring.
GPUs Were Never Built for This
The TPU team openly calls GPUs what they are: chips overloaded with graphics baggage.
To render a game, you need caching, thread schedulers, texture units, and complex branch logic. To run AI, you need none of that. But if you buy an H100, you are paying for all of it: die area, power, and complexity your model never even touches.

Google didn’t build the TPU as a science project. They built it because the math forced their hand. If every Android user used voice search for just three minutes a day, Google would have had to double its global data center footprint simply to keep up.
That isn’t an infrastructure challenge. That is an architectural failure.
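To make that concrete, here is a rough back-of-envelope version of that math. Every figure below is an illustrative assumption of mine (user count, inference cost per second of audio, fleet size), not a number from Google or from the linked analysis:

```python
# Back-of-envelope: why "three minutes of voice search a day" breaks a fleet.
# ALL numbers here are illustrative assumptions, not Google's actual figures.

android_users = 1e9            # assumed: ~1 billion active Android users
seconds_per_user = 3 * 60      # three minutes of speech per user per day
gflops_per_audio_sec = 50      # assumed: DNN inference cost per second of audio

# Daily inference demand from voice search alone, in GFLOPs
daily_demand = android_users * seconds_per_user * gflops_per_audio_sec

# Assumed existing fleet: one million servers sustaining 100 GFLOPs each
fleet_gflops = 1e6 * 100
daily_capacity = fleet_gflops * 86_400   # GFLOPs the fleet delivers per day

print(f"new demand / existing capacity = {daily_demand / daily_capacity:.2f}")
# -> ~1.0: the new workload alone needs a second fleet. Hence "double the
#    data center footprint" -- and hence a custom chip instead.
```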
The Systolic Difference
While the world was buying “more CUDA cores,” Google went back to physics.
GPUs hit the von Neumann bottleneck: data shuttles constantly between memory and compute units, and that traffic, not the arithmetic itself, dominates the energy bill. TPUs sidestep this with systolic arrays. Data flows across a grid of multiply-accumulate units in a single, continuous pass, with each operand fetched once instead of bouncing to memory at every step. The result is simple: higher performance per watt, lower heat, and dramatically lower cost.
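To show what “flowing across a grid” means in practice, here is a toy Python simulation of a weight-stationary systolic array, the general style of dataflow TPUs use. It abstracts away clocking and pipelining, and is a sketch of the technique, not Ironwood’s actual microarchitecture:

```python
import numpy as np

def systolic_matmul(A, W):
    """Simulate a weight-stationary systolic array computing A @ W.

    Each grid cell (k, n) holds one weight, W[k, n], for the whole
    computation ("weight-stationary"). Activations stream in from the
    left edge; partial sums flow downward from cell to cell. No cell
    ever writes an intermediate result back to memory -- the only
    memory traffic is streaming inputs in and draining results out.
    """
    M, K = A.shape
    K2, N = W.shape
    assert K == K2, "inner dimensions must match"
    out = np.zeros((M, N))
    for m in range(M):            # one input row streams through the grid
        partial = np.zeros(N)     # partial sums flowing down each column
        for k in range(K):        # grid row k of MAC cells
            # Every cell in row k multiplies the same streamed activation
            # A[m, k] by its stationary weight and adds the partial sum
            # arriving from the cell above, then passes it down.
            partial += A[m, k] * W[k]
        out[m] = partial          # results drain out the bottom edge
    return out

A = np.random.randn(4, 8)
W = np.random.randn(8, 3)
assert np.allclose(systolic_matmul(A, W), A @ W)
```

The point of the structure is reuse: each weight is loaded once and touched by every input row, and partial sums travel cell to cell instead of round-tripping through DRAM. That reuse is where the performance-per-watt advantage comes from.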
The Ecosystem Trap
I’ve said from day one: CUDA is the moat, not the GPU. The analysis confirms this. The only reason the world hasn’t shifted more aggressively to TPUs is ecosystem lock-in. Engineers learned CUDA in college. Companies fear egress costs. Switching hurts.

But look at the scoreboard. TPU v7 (Ironwood) is arguably on par with Nvidia’s Blackwell, with far better efficiency. The accidental empire is now being challenged by a design built on intention, not inheritance.
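The lock-in is also thinner than it looks one layer up the stack. As a sketch of why, here is a toy model in JAX (my choice for illustration; PyTorch/XLA makes the same point): nothing in the code names a vendor or a chip, so the silicon underneath becomes swappable:

```python
import jax
import jax.numpy as jnp

# A toy two-layer MLP. Nothing here mentions CUDA or a specific device;
# XLA compiles it for whatever accelerator the runtime finds.
def mlp(params, x):
    w1, w2 = params
    return jnp.tanh(x @ w1) @ w2

key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)
params = (jax.random.normal(k1, (128, 256)),
          jax.random.normal(k2, (256, 10)))

x = jnp.ones((32, 128))
y = jax.jit(mlp)(params, x)   # same source runs on CPU, GPU, or TPU
print(jax.devices())          # e.g. [CpuDevice(id=0)] or [TpuDevice(...)]
```

What remains is exactly the moat described above: retraining people, rewriting hand-tuned kernels, and moving data. Friction, not physics.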
The Bubble Is Unsustainable
We are building data centers that look like industrial plants, with cooling systems approaching the scale of power utilities. And we are doing this because we are trying to force graphics processors to think. Google solved the problem every hyperscaler is now running into. They proved that purpose-built silicon is the only way to scale AI without burning through energy grids and capital budgets.
A New Architectural Reality
This isn’t the end of GPUs, but it is the end of the myth. GPUs were a bridge, not a destination. The next decade of AI won’t be defined by convenience, but by architecture. And the architecture that wins will be the one built for AI, not inherited from gaming.