Claude Opus 4.6 and GPT-5.3 Codex Launched Hours Ago: Why the Craft of Directing AI Matters More Than Which Model You Use
Anthropic and OpenAI both launched major models this week. But the real productivity evidence is in how people direct AI - not which model they choose.
Friends,
Your weekly AI briefing is here - designed to help you respond to AI, not react to the noise. No curveballs. No chaos. Just clarity.
You need to become AI Fluent: The next Leaders AI Fellowship starts on 19th February - and it's been completely rebuilt. This cohort is a learn-through-doing experience where you'll build your own personal AI strategy from scratch. Along the way, you'll bring to life prompt engineering, context engineering, relay prompts and meta-prompting - not as abstract concepts, but as practical tools you'll use to create something you'll keep using long after the programme ends. Save your spot.
📰 This was the week that was...
Two major model launches in one week - and a flood of evidence that the craft of directing AI is where the real value lies.
Anthropic released Claude Opus 4.6 and OpenAI launched GPT-5.3 Codex within hours of each other. The models keep getting better. But the more interesting story this week is what people are doing with them.
A developer completed a $50,000 software contract for $297 in API costs using a technique named after Ralph Wiggum. A YC-backed founder published data showing AI agents degrade badly when you feed them too much context - and that the fix is structured workflow, not a bigger model. The creator of pandas declared that developer ergonomics have fundamentally changed because AI agents now compile and test code at 10-100x human velocity. And the FT reported that AI-driven productivity gains are real and accelerating. That's worth paying attention to.
Let's get into it.
🔥 Urgent Priorities
✅ No fires to fight this week
✅ Models are maturing. Workflows are hardening.
✅ The real shift is happening in how people work with these tools, not in which model tops a benchmark.
This isn't a week for panic. It's a week for upgrading how you think about AI productivity.
🎯 Strategic Insight
Tension: Everyone talks about AI productivity. But the conversation has been stuck in two modes: breathless hype about what AI could do, or disappointing anecdotes about what it actually does when you point it at real work.
Optimistic insight: What's changed this week is the quality of the evidence. The FT reported that AI-driven productivity gains are real and accelerating. That's not speculation - it's showing up in the numbers. The Ralph Wiggum technique isn't clever prompting - it's brute-force persistence that produces real, working software. Dex Horthy's context engineering research shows that when you treat AI context like a precious resource - keeping it under 60% utilisation, using sub-agents for noisy operations, structuring work into research-plan-implement phases - you get results that pass expert code review. The cost of doing the work really is collapsing towards zero. But the craft of directing that doing - that's where value is concentrating.
What's shifting: We're moving from "can AI do this?" to "how do I structure work so AI does this well?" That's a maturity shift. When agents compile and test code one to two orders of magnitude faster than humans, the bottleneck is no longer execution speed - it's the quality of the instructions and context you provide. The people getting extraordinary results aren't using secret models. They're using better workflows.
Why this matters now: The organisations that learn to work effectively with AI agents now will compound that advantage. This is about giving your team leverage - building the craft of directing AI across your entire business.
👉 Takeaway: Pick one repeatable task in your business and try structuring it as an AI workflow this month:
Define clear completion criteria (not "make it good" but "these five tests must pass") - see the sketch after this list
Break the work into research, planning, then implementation phases
Review the research output before letting the AI build anything
Track what worked and what didn't
Share the results with your team
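
To make the first step concrete, here's a minimal sketch of completion criteria as executable tests, using Python and pytest for a hypothetical weekly-report task. The file name, required sections and line limit are illustrative assumptions, not a prescription:

```python
# Completion criteria as executable tests (pytest). The report path,
# required sections and line limit are hypothetical - adapt to your task.
from pathlib import Path

REPORT = Path("weekly_report.md")  # the artifact the AI must produce

def test_report_exists():
    assert REPORT.exists(), "the agent has not produced the report yet"

def test_report_has_required_sections():
    text = REPORT.read_text()
    for section in ("## Summary", "## Metrics", "## Next steps"):
        assert section in text, f"missing section: {section}"

def test_report_is_concise():
    # "Good" defined as a number, not a feeling: 120 lines maximum.
    assert len(REPORT.read_text().splitlines()) <= 120
```

Run pytest after every AI pass; the work is finished only when all the tests go green - which is exactly the completion signal the Ralph Wiggum loop below relies on.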
If you want to accelerate this across your organisation, book a demo of the AI Transformation Accelerator - it's built specifically for the craft of directing AI to get real business value.
🤓 Geek-Out Stories
Anthropic released Claude Opus 4.6 and OpenAI launched GPT-5.3 Codex within hours of each other this week. The model race continues, with both releases focused on improved reasoning, coding capabilities and longer context handling.
Why it matters: The tools keep getting sharper. But as the other stories this week show, the difference between organisations that get value and those that don't isn't which model they use - it's how they structure their work to use it well.
A developer technique called the Ralph Wiggum loop has become an official Anthropic plugin for Claude Code. Named after The Simpsons character who embodies ignorance, persistence and optimism, the technique feeds the same prompt back into an AI agent every time it tries to stop - creating an autonomous loop that iterates until the job is done. One developer used it to complete a $50,000 contract for under $300 in compute costs. Y Combinator startups are using it to generate entire repositories overnight.
Why it matters: This is what AI productivity looks like in practice - not magical one-shot answers, but persistent, cheap iteration. The technique works best with clear completion criteria and automated testing, which is exactly how well-run businesses already define good work.
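
In practice the loop is almost embarrassingly simple. Here's a minimal sketch in Python, assuming Claude Code's non-interactive `claude -p` invocation as the agent call and a pytest suite as the completion check - the prompt, iteration cap and test command are illustrative assumptions, not the plugin's actual internals:

```python
# A bare-bones Ralph Wiggum loop: same prompt, every iteration, until the
# completion criteria pass or the budget runs out. Illustrative only - the
# official Anthropic plugin handles this (and more) for you.
import subprocess

PROMPT = "Implement the feature in SPEC.md. Run the tests. Fix any failures."
MAX_ITERATIONS = 50  # hard cap so a stuck agent cannot burn budget forever

def tests_pass() -> bool:
    """Completion criterion: the whole test suite runs green."""
    return subprocess.run(["pytest", "--quiet"]).returncode == 0

for iteration in range(1, MAX_ITERATIONS + 1):
    if tests_pass():
        print(f"Done after {iteration - 1} iterations.")
        break
    # Claude Code's non-interactive mode; swap in whichever agent you use.
    subprocess.run(["claude", "-p", PROMPT])
else:
    print("Iteration budget exhausted - review the work in progress manually.")
```

Note the precondition: the loop only terminates well when "done" is machine-checkable, which is why the completion criteria in the takeaway above matter more than any prompt cleverness.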
Dex Horthy, a YC-backed founder, published research showing that AI agents degrade significantly when context windows exceed 40-60% utilisation - what he calls "the dumb zone." His solution is a structured Research-Plan-Implement workflow where humans review compact research summaries (around 200 lines) before any code is written. The result: features that senior engineers estimated at three to five days were completed in seven hours and passed expert review.
Why it matters: This proves that the bottleneck in AI productivity isn't the model - it's how you feed it information. Leaders who understand context engineering will get dramatically more from the same tools as everyone else.
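
For the curious, here's a minimal Python sketch of that Research-Plan-Implement shape. The `ask_agent` function is a hypothetical stand-in for your real model client, and the token arithmetic is a rough heuristic - the 60% threshold and 200-line summary mirror the figures above; everything else is an illustrative assumption:

```python
# Research-Plan-Implement with a human review gate and a context budget.
# ask_agent is a hypothetical stand-in for your real model client.

CONTEXT_WINDOW_TOKENS = 200_000
DUMB_ZONE = 0.60  # past roughly 60% utilisation, agent quality degrades

def ask_agent(prompt: str) -> str:
    """Hypothetical model call - wire up your actual API client here."""
    return f"[model response to: {prompt[:50]}...]"

def utilisation(text: str) -> float:
    return (len(text) // 4) / CONTEXT_WINDOW_TOKENS  # ~4 chars per token

def check_budget(context: str) -> None:
    if utilisation(context) > DUMB_ZONE:
        raise RuntimeError(
            f"context at {utilisation(context):.0%} - compact it before continuing"
        )

def run_feature(task: str) -> str:
    # Phase 1: research, compressed into a reviewable ~200-line summary.
    research = ask_agent(
        f"Research the codebase for: {task}. Summarise in under 200 lines."
    )
    check_budget(research)
    print(research)
    input("Review the research summary, then press Enter to approve: ")

    # Phase 2: plan from a fresh context carrying only the compact
    # summary - not the noisy raw exploration that produced it.
    plan = ask_agent(
        f"Task: {task}\n\nResearch:\n{research}\n\nWrite a step-by-step plan."
    )
    check_budget(research + plan)

    # Phase 3: implement against the approved plan.
    return ask_agent(f"Implement this plan exactly:\n{plan}")
```

The human review at the end of the research phase is cheap insurance: 200 lines take minutes to read, and they determine everything the agent does next.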
🎨 Weekend Playground
🔎 Try Claude Opus 4.6 on one of your hardest family jobs
This weekend, take the new Claude Opus 4.6 for a spin on something genuinely useful at home. Build a meal planner that accounts for everyone's dietary preferences and what's already in the fridge. Create a tool that coordinates a complex family schedule across school, clubs, work and social commitments. Or set up a budget tracker that actually makes sense for how your household spends.
The point isn't to test the model's limits - it's to experience what directing AI feels like when the stakes are low and the feedback is immediate. You'll learn more about how to structure AI tasks in an hour with your family's chaos than in a week of corporate experiments.
✅ Pick a genuine family pain point - something that takes time and mental energy
✅ Give Claude clear constraints (budget limits, dietary restrictions, time blocks)
✅ Ask it to explain its reasoning, then refine
✅ Notice what worked and what you had to correct
If The AI Optimist helps you think more clearly, forward it to someone else navigating the shift.
If it's not quite landing, hit reply and let me know - I read every message.
Stay strategic, stay generous.
Hugo & Ben
