Everything except permission

Noam Shazeer co-authored “Attention Is All You Need” at Google in 2017 – the paper that ended the era of recurrent neural networks and made every large language model that exists today possible. He left Google to found Character.ai. Google paid roughly $2.7 billion to get him back in 2024. This week, he announced he’s joining OpenAI.

Three moves. Google was at one end of each.

What Google had

Shazeer’s specific contributions to the transformer paper were not peripheral. He proposed scaled dot-product attention and multi-head attention, the core mechanisms that made transformers work at scale. These are not theoretical footnotes. Every token generated by any LLM today runs on something that traces back to that work.

Google also has DeepMind, TPUs purpose-built for training at scale, YouTube’s video corpus, Gmail’s email corpus, and Search’s index. By any accounting of raw inputs to AI capability, Google has more than any competitor.

The permission problem

A line that keeps circulating in discussion around this move: “at Google, you can have everything except permission.”

I have watched this pattern in large organizations often enough to recognize it immediately. The company accumulates talent and resources, builds processes to manage the complexity that comes with scale, and those processes gradually shift from tools for getting things done to the primary source of organizational power. Approvals, reviews, sign-offs: these start as coordination mechanisms and end as the actual product for the subset of people whose careers depend on owning them.

When that happens, the people who joined to build something find themselves spending most of their time navigating people who didn’t join to build anything. They leave.

Why this keeps happening at Google specifically

Google has now been through this three times with the same person. Shazeer built transformers there and left; Google paid billions to bring him back; now he’s headed to a competitor. Each move is individually explicable. Three moves suggest something structural.

The structural thing is probably that Google’s size makes it difficult to give a small team the autonomy it would get at a 300-person company. Every decision about model architecture eventually touches Search, or Ads, or some other product with nine-figure revenue tied to it. The review processes that exist to protect those products also slow down the researchers trying to move past them.

The other reading is that internal arguments about research direction become political rather than technical at a certain scale. You stop winning because your idea is better and start winning based on who sponsored your work. For researchers who care about being right, that’s a more corrosive problem than any amount of paperwork.

What actually creates the moat

The obvious framing here is: talent is the moat, therefore winning talent competitions is the strategy. That’s partially true. Shazeer’s knowledge of what does and doesn’t work at scale is not in any paper. It’s twenty years of building these systems and knowing which implementation choices matter.

But the talent concentration reading misses something. The real moat is the conditions that make the best builders productive. A researcher at a place where they can test an idea and ship it in the same week does more than the same researcher wrapped in six months of approvals.

The question every organization running an AI initiative should be asking is not “do we have the right people?” It’s whether the people it has can actually move.

OpenAI will face the same problem as it scales. Microsoft almost certainly already does. Every organization does eventually. Google just got there first, and is paying for it in a particularly visible way.