Buckle up, devs. OpenAI's new toys just dropped.

I am used to the rapid progress in AI front, but I was still surprised when I saw what OpenAI pushed out this week. They’ve unleashed a whole arsenal of new models that are about to make your code sing and your agents dance. Let me break it down for you.

The Reasoning Monsters: o3 and o4-mini

Remember when we used to dream about models that could actually think? Well, dream no more. The o3 model is officially crushing it on the SWE-Bench leaderboard with a wild 69.1% score, making it the undisputed champion for coding tasks. I’ve been playing with it since yesterday, and trust me, this thing understands code like it wrote the compiler itself. Its little sibling, o4-mini, is the speed demon of the family – perfect when you need solutions yesterday but don’t want to mortgage your apartment to pay the API bill. Some insider tips from my testing:

  • Skip the Chat Completions API and go straight for the Responses API
  • Enable those reasoning summaries – watching the model’s thoughts stream in real-time is oddly satisfying
  • The model actually remembers its reasoning between calls (finally!)

The tier system is what you’d expect: o4-mini for tiers 1-5, while o3 needs tier 4-5 access. But here’s a hack – if you’re on tiers 1-3, just verify your org and you’re in the o3 club. Oh, and don’t sleep on that new “Flex processing” option. It’s basically the economy class ticket – cheaper tokens in exchange for longer wait times. Perfect for those background tasks where you’re not exactly counting milliseconds.

GPT-4.1: The Developer’s Best Friend

The GPT-4.1 family just landed too – with three siblings in tow: standard, mini, and nano. I’ve been testing them all week and they’re clearly tuned for us devs – way better at following detailed instructions and making those function calls work seamlessly. The most impressive part? These babies can handle up to 1 million tokens of context. I fed one of my Vue projects to GPT-4.1 yesterday – the entire codebase – and it navigated the dependencies like it had built the thing itself.

Tools That Made Me Smile

OpenAI Agents SDK: This enables you to build agentic AI apps in a lightweight, easy-to-use package with very few abstractions. Agent frameworks are now settling in to a common architecture. Codex CLI: This might be my favorite drop of the bunch. It’s an open-source coding agent that lives right in your terminal. Just describe what you want to build, and watch the magic unfold. Works with all the new models, naturally. Evals API: No more manual testing hell! You can finally programmatically define, automate, and iterate on your prompts. This alone saved me three hours yesterday.

Voice Is Getting Scary Good

They’ve also dropped three new audio models that make Whisper look like a relic. The new TTS model takes instructions on how to speak – I had it narrate documentation with different personalities yesterday, and my team couldn’t stop laughing. The updated Agents SDK now supports audio too, so those voice agents we’ve been planning just got a whole lot easier to build.

Free Tokens (Because Who Doesn’t Love Free?)

If you’re eligible (check your dashboard), you can score free usage by sharing your prompts and completions with OpenAI. This offer runs through April, so might as well cash in while you can.

Resources Worth Bookmarking

  • OpenAI.fm (their TTS demo) is now open source
  • Voice Agents SDK sample app
  • Starter app for the Responses API
  • Computer Using Agent sample
  • Customer Support Agent with Human in the Loop demo

What are you building with these new toys? Hit reply and let me know – I’m especially curious about any Agent implementations you’re cooking up.