The vendor slide that doesn't survive your accountant
The deck says it plainly: "Saves each rep 47 minutes a day. At a $68,000 salary, that's $11,200 per person per year." Multiply by 30 people and you've got a $336,000 return on a $40,000 tool. Sign here.
Now go ask your bookkeeper to find that $336,000. They can't. Nobody's salary went down. You didn't lay anyone off. The reps are still on payroll for the same amount, doing roughly the same volume of work in a slightly less annoying way. The "savings" exist only inside the multiplication. Those 47 minutes, scattered across a day in pieces of six minutes here and eleven there, do not reassemble themselves into a redeployable hour, let alone a check.
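The gap between the slide and the ledger can be put in a few lines of Python. This is only a sketch: the $11,200, 30 reps, $68,000 salary, and $40,000 tool cost come from the slide above and are taken at face value, while the function names and the assumption that payroll is simply headcount times salary are illustrative.

```python
# Vendor-slide arithmetic versus what a bookkeeper can actually find.
# Slide figures are taken at face value; function names are hypothetical.

HEADCOUNT = 30
SALARY = 68_000
CLAIMED_SAVING_PER_REP = 11_200  # the vendor's number, not re-derived here
TOOL_COST = 40_000


def headline_savings(headcount: int, per_rep: int) -> int:
    """The slide's number: per-person 'savings' times headcount."""
    return headcount * per_rep


def booked_savings(spend_before: int, spend_after: int) -> int:
    """What the ledger shows: the actual change in spend."""
    return spend_before - spend_after


# Nobody left and no salary changed, so payroll is identical before and after.
payroll_before = HEADCOUNT * SALARY
payroll_after = HEADCOUNT * SALARY

slide = headline_savings(HEADCOUNT, CLAIMED_SAVING_PER_REP)
booked = booked_savings(payroll_before, payroll_after)

print(f"Slide says:    {slide:>10,}")
print(f"Ledger shows:  {booked:>10,}")
print(f"Net of tool:   {booked - TOOL_COST:>10,}")
```

On these inputs the slide reports 336,000, the ledger reports 0, and the only number that actually posts is the 40,000 you spent.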
This is the trap underneath most AI business cases, and it's spreading fast because adoption is spreading fast. The RSM middle-market AI survey shows how broadly mid-market firms have rolled these tools out — which is exactly why the ROI math now matters more than the tool selection. Wide usage with phantom returns is how a CFO ends up cutting an AI budget two quarters after approving it. The fix isn't to stop counting time. It's to stop pretending saved time is the finish line. Saved time is the start of a question: did that capacity turn into something the business can actually bank?
Five returns that show up in the numbers — and the test each one has to pass
Time saved is the input. These are the outputs that an operating review will actually accept. Each comes with a test you should refuse to skip.
1. A cost that disappears. A contractor you stop renewing. A seat you don't refill. A tool the AI replaces. Test: name the invoice or the headcount line that gets smaller. If nothing shrinks, this category is zero.
2. Capacity that got redeployed. The freed time has to land somewhere measurable — more accounts per rep, a project that finally got staffed, a queue someone now covers. Test: what did the person do with the hour, and can you see it in their output? The OECD SME AI adoption report is worth reading here precisely because it separates having the tool from changing what gets done with it — and only the second one pays.
3. Cycle time that moved a result. Faster proposal turnaround only counts if a downstream number reacts — win rate, response time, days-to-close. Test: a before/after on the business metric, not the drafting speed.
4. Quality and rework. Fewer corrections, fewer escalations, fewer QA kicks. Test: defect or rework rate from before vs. after. "It feels cleaner" is not data.
5. Revenue or retention response. A deal pattern, a renewal lift, a churn dip you can attribute. Test: the sales or retention metric, with the caveat that this is the hardest to isolate and the easiest to fake.
And the cost line every honest model carries: implementation, review time, training, maintenance, and the hours spent catching the AI's mistakes. Leave those out and you're not measuring ROI; you're advertising. The San Francisco Fed's small-business AI analysis is a useful reminder that for a 40-person firm the integration drag is proportionally heavier than the license fee, and it rarely makes the slide.
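One way to keep the five categories and the cost line honest is to hold them in a single ledger where any category that has not passed its test stays at zero. The sketch below assumes annual dollar amounts. Apart from the $40,000 license taken from the slide, every figure is hypothetical and exists only to show the arithmetic; the class and function names are illustrative, not part of any particular tool.

```python
from dataclasses import dataclass, asdict


@dataclass
class Returns:
    """Annual dollars per category. A category that failed its test stays 0."""
    cost_removed: int = 0         # an invoice or headcount line that shrank
    capacity_redeployed: int = 0  # visible extra output from the freed hours
    cycle_time_result: int = 0    # a downstream metric that reacted
    quality_rework: int = 0       # a measured drop in rework or defects
    revenue_retention: int = 0    # an attributable sales or retention change


@dataclass
class Costs:
    """Annual dollars for the line that rarely makes the slide."""
    license: int = 0
    implementation: int = 0
    review_time: int = 0
    training: int = 0
    maintenance: int = 0
    error_correction: int = 0


def total(ledger) -> int:
    """Sum every field of a Returns or Costs ledger."""
    return sum(asdict(ledger).values())


def net_return(returns: Returns, costs: Costs) -> int:
    return total(returns) - total(costs)


def roi(returns: Returns, costs: Costs) -> float:
    """Net return divided by total cost."""
    return net_return(returns, costs) / total(costs)


# Hypothetical workflow: one contractor not renewed, some rework avoided.
returns = Returns(cost_removed=24_000, quality_rework=10_000)
costs = Costs(
    license=40_000,
    implementation=15_000,
    review_time=12_000,
    training=5_000,
    maintenance=6_000,
    error_correction=8_000,
)

print(f"Total returns: {total(returns):,}")
print(f"Total costs:   {total(costs):,}")
print(f"Net return:    {net_return(returns, costs):,}")
print(f"ROI:           {roi(returns, costs):.0%}")
```

With these made-up inputs the workflow is underwater by 52,000 even though two categories produced real returns, which is the kind of result the minutes-times-salary shortcut can never show.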
Decide the measurement before you turn the tool on
Here's the move most teams get backwards: they buy, deploy, and then go looking for proof. By then there's no baseline to compare against, so the case gets argued on vibes — and vibes lose to a budget cut. Set the measurement up front, while you still have a clean "before."
For one workflow, do five things before week one:
1. Capture the baseline number: current win rate, current rework rate, current days-to-close, whatever this workflow is supposed to move.
2. Pick the single business metric you'll judge it on.
3. Log accepted vs. rejected AI output so you can see how much human review it actually demands.
4. Track the error rate, because review time is a real cost.
5. Put a 30/60/90-day kill date on the calendar.
The point of the kill date is permission: a workflow that hasn't moved its metric by day 90 gets shut off, not defended.
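The five steps above fit in a small tracking record. This is a minimal sketch under stated assumptions: the class name, the field names, and the rule that "moved" means any change from baseline in the right direction are all illustrative choices, and the error rate is defined here as errors caught divided by all logged outputs. A real review would probably want a minimum threshold rather than any movement at all.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class WorkflowTrial:
    metric_name: str               # the single business metric being judged
    baseline: float                # captured before the tool is turned on
    start: date
    higher_is_better: bool = True  # False for metrics like days-to-close
    accepted: int = 0
    rejected: int = 0
    errors_caught: int = 0

    def log_output(self, accepted: bool, had_error: bool = False) -> None:
        """Record one piece of AI output and whether a reviewer kept it."""
        if accepted:
            self.accepted += 1
        else:
            self.rejected += 1
        if had_error:
            self.errors_caught += 1

    @property
    def outputs(self) -> int:
        return self.accepted + self.rejected

    def acceptance_rate(self) -> float:
        return self.accepted / self.outputs if self.outputs else 0.0

    def error_rate(self) -> float:
        return self.errors_caught / self.outputs if self.outputs else 0.0

    def checkpoints(self) -> list[date]:
        """The 30/60/90-day review dates."""
        return [self.start + timedelta(days=d) for d in (30, 60, 90)]

    @property
    def kill_date(self) -> date:
        return self.checkpoints()[-1]

    def moved(self, current: float) -> bool:
        """Assumption: any change from baseline in the right direction counts."""
        if self.higher_is_better:
            return current > self.baseline
        return current < self.baseline

    def verdict(self, today: date, current: float) -> str:
        if self.moved(current):
            return "keep"
        if today >= self.kill_date:
            return "kill"
        return "keep measuring"


# Hypothetical trial: proposal drafting, judged on win rate.
trial = WorkflowTrial("win rate", baseline=0.22, start=date(2025, 1, 6))
trial.log_output(accepted=True)
trial.log_output(accepted=True)
trial.log_output(accepted=True)
trial.log_output(accepted=False, had_error=True)

print(trial.checkpoints())
print(trial.acceptance_rate(), trial.error_rate())
print(trial.verdict(today=trial.kill_date, current=0.22))
```

The order of checks in `verdict` is deliberate: a workflow that has moved its metric is kept regardless of the date, and the kill date only applies to one that has not.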
The Deloitte State of AI report keeps pointing at the same gap between how much AI gets used and how little reaches production value, and Gartner expects over 40% of agentic AI projects to be canceled by the end of 2027, citing escalating costs, unclear business value, and inadequate risk controls. Unclear business value is exactly what a missing baseline produces. The projects that survive are the ones that could prove a number changed. Run your first workflow through the AI ROI Calculator with these five categories instead of the minutes-times-salary shortcut, and you'll walk into the operating review with a case that holds. The discipline is the same one that turns "we think we're forecasting better" into 92% forecast accuracy: a number you can defend, not a number you hoped for.