
Two questions every CFO should ask before the next AI renewal

GPT-5.5 doubled in price. Claude is throttling on weekday afternoons. Two new frameworks (COMPASS and Tokenomics) now let UK SME leaders answer the question that's been a vibe for two years: what does our AI actually cost, and is it actually worth it?

Friends,

your weekly AI briefing is here - designed to help you respond to AI, not react to the noise. No curveballs. No chaos. Just clarity.

Something I made this week that I'd love to show you.

Last weekend I went for a walk through the woods around Harpenden, in the sun, talking into my phone. By the time I was back home, Harpenden.AI existed - a vision for making Harpenden the UK's first truly AI-fluent town. I did no machine work on it, only human imagination. The tools I've built in the last six months - what I've started calling my Imaginarium - did the rest.

I'm sharing it because it's a small, real proof of where the cost of doing has actually got to. A vision document, a town strategy, a council pitch and a public artefact - shaped on a forest walk. If you want to see what that looks like in practice, the page is here. If you want one for your business, your school, your trust, your team, hit reply.

📰 This was the week that was...

This was the week the question of "whose AI" stopped being theoretical. China dropped DeepSeek V4, an open-weight, MIT-licensed model with 1 million tokens of context as standard, with V4-Flash priced at roughly $0.28 per million output tokens. On the same day OpenAI launched GPT-5.5, and the day after its Microsoft exclusivity ended, OpenAI's models landed on AWS Bedrock. Suddenly the menu of "where can I run a serious model" went from short to long.

The geopolitical weather changed too. Beijing ordered Meta to unwind its $2 billion acquisition of Manus. France told every government ministry to plan its exit from Microsoft to Linux by autumn, covering 2.5 million civil servants. The signal is steady: the cheapest, most powerful tools are arriving fast, and where they run, on whose hardware, under whose laws is becoming a board-level question rather than a technical one.

Let's get into it.

🔥 Urgent Priorities

✅ No fires to fight this week
✅ Frontier models are getting cheaper at the bottom, more expensive at the top, and trickier to plan against
✅ A new question is about to land on every CFO's desk: what does AI actually cost, and what's it actually worth?

This isn't a week for panic. It's a week for a conversation you may have been quietly avoiding: does our AI stack make money for us, and does our pricing match how the AI actually creates value?

🎯 Strategic Insight: the unit economics of AI just got sharp enough to answer

For two years, "what does AI actually cost, and what's it actually worth?" has been a vibe more than a number. This week the bill arrived - in two visible ways - and two serious pieces of work landed that close the gap between vibes and a real answer.

Tension: The cost of running AI workloads, after two years of being treated as a rounding error, has become a headline number. OpenAI doubled its API price for the new GPT-5.5 model that launched on Thursday: $5 per million input tokens, $30 per million output, up from $2.50 and $15. OpenAI argues the model is more token-efficient, so the effective cost rises by closer to 20% than 100%, but GitHub Copilot is launching it with a 7.5x premium request multiplier.

On the other side of the market, Anthropic is so compute-constrained that since 27 March it has been quietly draining Claude session limits faster on weekday afternoons (1pm-7pm GMT). Some Max subscribers ($200/month) have reported their full daily allowance gone in 19 minutes. As Vin Vashishta puts it in The Tokenomics of Agentic Commerce, "as models scaled in size, inference costs scaled in silence." The silence has ended.
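If you want to sanity-check the "closer to 20% than 100%" claim for your own workloads, here's a back-of-envelope sketch in Python. The 0.60 token-efficiency factor is an illustrative assumption of mine, not a published OpenAI figure - plug in whatever ratio you observe on your own jobs:

```python
# Back-of-envelope: effective cost change when sticker prices double
# but the new model uses fewer tokens for the same work.
# The 0.60 efficiency factor is an illustrative assumption.

OLD_PRICE = {"input": 2.50, "output": 15.00}   # $ per million tokens
NEW_PRICE = {"input": 5.00, "output": 30.00}

def job_cost(price, input_m, output_m):
    """Cost of a job that uses input_m / output_m million tokens."""
    return price["input"] * input_m + price["output"] * output_m

# Same job: the old model uses 1M tokens in / 1M out; assume the
# new model needs only 60% of the tokens to do the same work.
efficiency = 0.60
old = job_cost(OLD_PRICE, 1.0, 1.0)
new = job_cost(NEW_PRICE, 1.0 * efficiency, 1.0 * efficiency)

rise = (new - old) / old
print(f"Sticker price: +100%  Effective cost: {rise:+.0%}")
```

At a 60% token ratio the effective rise lands at 20% - which is presumably the kind of arithmetic behind OpenAI's framing. The point of the sketch is that the factor is yours to measure, not theirs to assert.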

Meanwhile, most SMEs are still paying for AI on a per-seat basis - the same way they pay for Microsoft 365 - while the AI doing the most useful work is no longer a tool that helps a named user but an agent that does the work autonomously while no one is logged in.

Optimistic insight: Two complementary frameworks have arrived at exactly the right moment.

On the buying side, Michael Mansard at Zuora's Subscribed Institute has just published the COMPASS framework (Choice of Optimal Metrics for Pricing Agentic Systems and Solutions). Two questions get you to the right pricing model:

  • What's the AI's job? Is it a worker (always-on capacity), a service (specific tasks done), a utility (raw consumption), or a partner (delivering business outcomes)?

  • How attributable is its value? Can you cleanly link what the agent does to a number on your P&L - or is it tangled up with other factors?

Those answers point to one of four pricing structures: per-agent, per-activity, per-output, or per-outcome.
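To make the decision concrete, here's how I'd sketch it in Python. The mapping reads the two lists above in parallel (worker → per-agent, and so on), which is my own reading of COMPASS, not Mansard's published logic - treat the names and the fallback rule as illustrative:

```python
# Illustrative sketch of the COMPASS decision - the mapping and
# the fallback rule are my own reading, not Zuora's published logic.

def compass_pricing(job: str, attribution: str) -> str:
    """Suggest a pricing structure from the two COMPASS questions.

    job: "worker" | "service" | "utility" | "partner"
    attribution: "low" | "medium" | "high"
    """
    base = {
        "worker": "per-agent",      # always-on capacity
        "service": "per-activity",  # specific tasks done
        "utility": "per-output",    # raw consumption
        "partner": "per-outcome",   # business outcomes delivered
    }
    structure = base[job]
    # Outcome pricing only makes sense when value is cleanly
    # attributable to the agent; otherwise fall back to activity.
    if structure == "per-outcome" and attribution != "high":
        structure = "per-activity"
    return structure

print(compass_pricing("partner", "high"))  # per-outcome
print(compass_pricing("partner", "low"))   # per-activity
```

The fallback is the part worth arguing about with your vendor: an agent sold as a "partner" but with tangled value attribution shouldn't be priced on outcomes it can't prove.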

On the operating side, Vashishta's piece introduces Tokenomics - the discipline of designing AI workloads that actually make money. His argument is sharper than the usual ROI hand-wave: every agent has to scale on three dimensions, not two. Reliability and utility are the ones everyone tracks. Profitability is the one that quietly kills initiatives. He cites OpenAI's decision to shut down Sora and Microsoft's pullback on free Copilot features as the same story: even the giants are hitting workloads where the maths doesn't work.

But the upside, when it does work, is significant. Macy's launched "Ask Macy's", an AI shopping assistant built on Gemini, in March. Customers who use it spend 4.75 times more than those who don't. Vashishta says his retail clients are seeing 3x to 6x lifts on similar tools - but only when they're built as proper assistants, not chatbot-shaped marketing.

What's shifting: The smart question is no longer "should we deploy AI?" but two harder, more useful ones:

  • On every workload we're paying for: what's the agent's job, how attributable is its value, and does the pricing model match? (COMPASS)

  • On every workload we're building or running: do reliability, utility AND profitability all scale together as we grow, or does one of them quietly fail? (Tokenomics)

This is Job 2 of the Three Jobs Framework - making room in the P&L using AI - meeting reality. Job 2 isn't "use AI to save costs". It's "use AI in places where the unit economics work, and ruthlessly stop using it in places where they don't". Most SMEs haven't yet drawn that distinction.

Why this matters now: If you only plan around today's invoice, you'll discover the real cost in the renewal. If you only plan around the upside ("Macy's got 4.75x!"), you'll deploy something that costs more to run than it earns. If you instead bring both sides of the question to the same table - what we're paying and whether the workload actually pays - you walk into 2027 with an AI stack that compounds, not one that quietly leaks margin.

And notice how this loops back to the question we started with. You can't be sovereign over a tool that doubles its price overnight, or that throttles you between 1pm and 7pm on a Tuesday. The tokenomics question and the sovereignty question are the same question, asked from two angles.

👉 Takeaway: Before your next AI contract renewal, run this thirty-minute exercise with your CFO and whoever owns operations:

  • List your top three AI vendors and your top three AI workloads (the workloads might be internal, not bought-in).

  • For each, name the agent's job: worker, service, utility, or partner.

  • For each, mark value attribution as low, medium or high.

  • For each, mark profitability honestly: is the workload generating more than it costs to run, including inference? Don't know? That's the answer.

  • Identify one workload to renegotiate (using the COMPASS lens), and one workload to either redesign or retire (using the Tokenomics lens).

You'll know more about what your AI is actually doing - and earning - than 80% of your peers walking into the same renewal conversations.

If you'd like a hand running this on your own AI stack, hit reply.

🤓 Geek-Out Stories

(Three stories the headlines didn't bring you. The world behind the world.)

1️⃣ AI is starting to learn from physics, not just text

Nature this week covered the rise of world models - AI systems trained not on more text from the internet, but on data about how physical environments actually behave. The argument is that today's large language models are statistical pattern-matchers; world models are the beginning of something different - AI that holds an internal representation of space, time and consequence. NVIDIA's Cosmos, DeepMind's DreamerV3 family, and a wave of new robotics research are all converging on this idea.

Why it matters: This is, in plain English, AI moving from intellectualised observation (predicting the next word in a sentence) to embodied participation (predicting what happens when you push the cup). For an SME leader, the practical implication is not next year - it's three to five years out. But it's the right time to start thinking about what your business looks like when AI can model your operations, your supply chain or your customer journey as a living physical system, not as a spreadsheet.

👉 Action: Pick one part of your business that's currently modelled in a spreadsheet. Ask: if a system could simulate this as a living thing - with cause and effect, not just averages - what decisions would we make differently? File the answer somewhere you'll find it again in eighteen months.

2️⃣ Sony's table-tennis robot beat elite players this week

Sony AI's Project Ace made the cover of Nature on 23 April. It's the first robot to compete with elite and professional human table-tennis players in a real, regulation match. It returned spins of up to 450 radians per second. It scored 16 direct points on serve against elite players, who managed eight against it. The interesting bit is not the score - it's that it perceives, plans and acts at the edge of human reaction time, in the real world, with real spin and real noise.

Why it matters: Most AI we work with lives in pure text. This is AI in a body, in a room, against a person who is also adapting in real time. That's the same physics your delivery vans, your warehouse, your service engineers, your customers actually inhabit. The cost of doing physical work is going to fall the same way the cost of doing knowledge work has - just three years behind. The leaders who notice now will quietly position before their sectors do.

👉 Action: Ask your operations lead a single question this week: "Where in our business does a person currently bridge the gap between a screen and the physical world?" Write down what you learn. That list is your physical-AI roadmap, three years early.

3️⃣ AI is being used to keep dying languages alive - and it's quietly the most optimistic story in the field

The UN reports that an indigenous language disappears every two weeks. This year, research from IBM and the University of São Paulo showed that fine-tuning state-of-the-art translators on surprisingly tiny amounts of data can now produce high-quality machine translation for languages spoken by only a few thousand people. Brazilian indigenous communities are using it to build spell-checkers, next-word predictors, and writing tools in their own language. Project CETI, meanwhile, has been using AI to map sperm-whale "phonetic alphabets" - 18 rhythms, 5 tempos, optional rubato, optional ornamentation, hundreds of distinct codas - opening the question of what whales might actually be saying to each other.

Why it matters: This is what AI for good looks like when it isn't a press release. Tools optimised for the giants have a side effect: when you turn them sideways, they become unreasonably useful for the small, the rare and the threatened. The same architecture that helps a Fortune 500 do call-centre triage helps a community of 20,000 speakers keep their grandmothers' language alive. There's a quiet lesson for SME leaders here too: the best uses of AI are often the small, specific, deeply contextual ones - not the generic enterprise rollouts.

👉 Action: Find one thing in your business that has been "too small to bother automating" - a niche process, a small customer segment, a piece of institutional knowledge held by one person. Spend an hour this week asking what AI could do for just that thing. The answers tend to be surprising.

🎨 Weekend Playground

This weekend, take your phone for a walk and try Merlin Bird ID, the free app from Cornell Lab of Ornithology. Hold it up to a tree, hit Sound ID, and stand still. Within seconds it will tell you which birds are singing around you. Photo ID does the same for what you can see. It works offline. It's free. It's powered by machine learning trained on millions of recordings collected by ordinary people who love birds.

It's the most quietly profound piece of AI I know. The reviews say it better than I can - one user wrote that Merlin gave her a daily mindfulness practice she didn't know was missing.

Why this matters: This is AI doing what AI is best at - quietly, in the background, connecting a person to the living world. No screens to scroll. No prompts to engineer. No models to choose. Just you, the morning, and the birds you didn't realise were already there. Natural intelligence, supported by silicon intelligence. As lived practice.

👉 Mission:

  • Download Merlin before Saturday morning

  • Walk somewhere green - your garden, your local park, the woods, the river

  • Stand still for five minutes, hit Sound ID, and just listen

  • Notice what was there the whole time

  • Bonus: do it before you check your phone for any other reason on Saturday

📢 Share the Optimism

If The AI Optimist helps you think more clearly, forward it to someone else navigating the shift. If it's not quite landing, hit reply and let me know - I read every message.

Stay strategic, stay generous.

Hugo & Ben