Pick one workflow, or you'll finish none of them
Here is the version of the 90-day plan that fails. A growing company gets excited, lists eight processes AI "could help with," buys a tool, and assigns a champion to "roll it out." By day 90 there are three half-configured workflows, a Slack channel of confused users, and nobody who can say whether anything got faster. The calendar filled up. The work didn't.
The fix is unromantic: ninety days buys you one workflow moved from scope to controlled daily use, maybe two if they're close cousins. Not company-wide transformation. The point of the time box isn't speed, it's sequence. Pick the workflow, lock the baseline, build a rough first version, put it in front of real users, read the evidence, and make the stop-or-scale call. Each step depends on the one before it, so skip one and the later steps collapse.
The appetite is real — the RSM middle-market AI survey shows mid-market firms moving fast on AI. But appetite isn't a plan a department manager can run on a Tuesday. Ninety days is long enough to surface the ugly stuff: the data lives in three systems, the people who'd use it don't trust it yet, and someone has to actually check the output. It's short enough that you can't hide behind "we're still exploring."
So before any tool gets configured, write down six things on one page: the workflow, its owner, the source systems it reads from, the review standard for what "good output" means, who gets trained, and the single before-state number you'll measure against. If you can't fill in all six, you don't have an implementation plan — you have a demo schedule.
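If it helps to make the one-page test mechanical, here is a minimal sketch in Python. The field names, the `WorkflowPlan` class, and the sample values are all hypothetical, made up for illustration; the only thing taken from the checklist above is that there are six items and every one has to be filled in.

```python
from dataclasses import dataclass, fields


@dataclass
class WorkflowPlan:
    """The six items from the one-page checklist (field names are illustrative)."""
    workflow: str = ""
    owner: str = ""
    source_systems: tuple[str, ...] = ()
    review_standard: str = ""
    trainees: tuple[str, ...] = ()
    baseline_metric: str = ""


def missing_fields(plan: WorkflowPlan) -> list[str]:
    """Return the names of any checklist items that are still empty."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name)]


# Hypothetical example: everything filled in except the review standard.
plan = WorkflowPlan(
    workflow="Support ticket triage",
    owner="Support manager",
    source_systems=("helpdesk", "CRM"),
    review_standard="",
    trainees=("tier-1 agents",),
    baseline_metric="average handle time, minutes",
)

gaps = missing_fields(plan)
if gaps:
    print(f"Demo schedule, not a plan. Still missing: {gaps}")
else:
    print("All six items filled in.")
```

The check is deliberately dumb. It can't tell a good review standard from a bad one; it only stops you from starting with a blank.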
What each block of days actually has to produce
Phases only matter if each one ships a thing you can inspect. Vague "discovery" eats weeks. Here's the cadence, with the deliverable that proves the block is done.
Days 1-15 — baseline, written down. Map the current workflow as it really runs, not as the org chart claims. Who touches it, what they paste from where, where it stalls, what a mistake costs. Capture the before-number now — average handle time, error rate, hours per week, whatever the metric is. If you measure the baseline in week ten instead of week one, you've lost your ability to prove anything, because you'll be comparing memory to data.
Days 16-35 — a rough working version against real work. Configure or prototype the workflow using approved source material and a small evaluation set of actual cases — not invented examples. The deliverable isn't a polished UI. It's evidence the thing produces usable output on ten or twenty real inputs you already know the right answer to.
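A rough way to produce that evidence is to run the prototype over the evaluation set and count how many outputs match the answer you already know. The sketch below assumes exact-match comparison, which is a simplification: for drafted text you would swap in a human reviewer's accept or reject. The `route_ticket` function is a hypothetical stand-in for whatever the configured workflow actually does, and the three cases are invented placeholders for the shape of the data, not real cases.

```python
def score_evaluation_set(cases, produce_output):
    """Run every known case through the workflow.

    Returns (usable_count, total, failures), where failures lists each case
    whose output did not match the known right answer.
    """
    failures = []
    for case in cases:
        got = produce_output(case["input"])
        if got != case["expected"]:
            failures.append(
                {"input": case["input"], "expected": case["expected"], "got": got}
            )
    return len(cases) - len(failures), len(cases), failures


def route_ticket(text: str) -> str:
    """Hypothetical stand-in for the configured workflow."""
    lowered = text.lower()
    if "invoice" in lowered or "refund" in lowered:
        return "billing"
    if "password" in lowered or "login" in lowered:
        return "access"
    return "general"


# Placeholder cases; in practice these come from real, already-resolved work.
cases = [
    {"input": "Refund for duplicate invoice", "expected": "billing"},
    {"input": "Cannot login after reset", "expected": "access"},
    {"input": "Where is my order?", "expected": "shipping"},
]

usable, total, failures = score_evaluation_set(cases, route_ticket)
print(f"{usable}/{total} usable")
for failure in failures:
    print("needs work:", failure)
```

The failures list matters more than the score. It tells you which source rules to tighten before anyone outside the project sees the thing.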
Days 36-60 — controlled pilot. A handful of named users, written review rules, logging on every output, and an explicit exception path for when it's wrong. This is where you learn whether people will use it or quietly route around it.
Days 61-80 — widen and harden. Train the broader group, tighten the source rules based on what the pilot exposed, fix the handoffs that broke. The OECD SME AI adoption report is blunt about why this block matters for smaller firms: implementation capacity, not model quality, is the binding constraint. A 40-person company can't absorb five new ways of working at once. One, done properly, is the realistic ceiling.
Days 81-90 — read the evidence and decide. Active users, accepted output, rejected output, cycle time, quality, the value metric from day one. The block's deliverable is a decision: scale it, fix it, or stop it.
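If the pilot logged every output, the readout is mostly arithmetic. Here is a minimal sketch, assuming each log entry records a user, whether the output was accepted, and minutes spent, and assuming the day-one baseline was an average handle time in minutes. The record layout, the function name, and the numbers are hypothetical.

```python
from statistics import mean


def day_90_readout(log, baseline_minutes):
    """Summarize a pilot log against the baseline captured at the start.

    Each log entry is a dict with keys: user, accepted (bool), minutes.
    """
    if not log:
        raise ValueError("No logged outputs, so there is nothing to read.")
    accepted = sum(1 for entry in log if entry["accepted"])
    avg_minutes = mean(entry["minutes"] for entry in log)
    return {
        "active_users": len({entry["user"] for entry in log}),
        "accepted": accepted,
        "rejected": len(log) - accepted,
        "accepted_rate": accepted / len(log),
        "avg_minutes": avg_minutes,
        # Negative means faster than the baseline.
        "cycle_time_change": (avg_minutes - baseline_minutes) / baseline_minutes,
    }


# Hypothetical log of four outputs from three users.
log = [
    {"user": "ana", "accepted": True, "minutes": 8},
    {"user": "ben", "accepted": True, "minutes": 10},
    {"user": "ana", "accepted": False, "minutes": 14},
    {"user": "cy", "accepted": True, "minutes": 8},
]

readout = day_90_readout(log, baseline_minutes=16)
for name, value in readout.items():
    print(f"{name}: {value}")
```

Note that the function needs `baseline_minutes` as an input. If nobody wrote that number down in the first block, there is no honest way to fill it in now.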
"Technically live" is not the finish line — adoption is
The most common way a 90-day plan lies to you: the system is up, the integration works, the dashboard is green, and you call it done. Then you watch the logs and discover four people used it twice and went back to the spreadsheet. Technically live, functionally abandoned.
That gap is the whole problem the Deloitte State of AI report keeps circling: lots of AI activity, very little process change. Activity is buying tools and running pilots. Process change is people doing the work differently and a manager able to inspect the new way. Your day-90 readout has to prove the second thing, not the first.
So judge the quarter on two honest tests. Did adoption actually happen — are the intended users running this for real work, not performing it for the demo? And did quality hold — accepted-output rate steady or rising, rework not creeping up? Both green means you've earned the second workflow. High adoption with sliding quality means you shipped speed and lost trust, and you fix that before you scale. Low adoption means the workflow, the training, or the review rule was wrong, and no amount of additional spend papers over that.
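The two tests can be written down as a small decision rule. This is a sketch, not a standard: the `Readout` fields, the comparison of start-of-pilot to end-of-pilot rates, and especially the 60% adoption threshold are assumptions for illustration. Pick your own threshold before the pilot starts, not after you've seen the numbers.

```python
from dataclasses import dataclass


@dataclass
class Readout:
    """Day-90 numbers (field names are illustrative)."""
    intended_users: int
    active_users: int
    accepted_rate_start: float
    accepted_rate_end: float
    rework_rate_start: float
    rework_rate_end: float


def decide(r: Readout, min_adoption: float = 0.6) -> str:
    """Apply the adoption test first, then the quality test.

    min_adoption is a hypothetical threshold: the share of intended users
    who must be running the workflow for real work.
    """
    adopted = (
        r.intended_users > 0
        and r.active_users / r.intended_users >= min_adoption
    )
    quality_held = (
        r.accepted_rate_end >= r.accepted_rate_start
        and r.rework_rate_end <= r.rework_rate_start
    )
    if not adopted:
        return "revisit the workflow, the training, or the review rule"
    if not quality_held:
        return "fix quality before scaling"
    return "earned the second workflow"


# Hypothetical quarter: most people use it, but accepted output is sliding.
print(decide(Readout(10, 8, 0.85, 0.78, 0.10, 0.15)))
```

Adoption is checked first on purpose. With only four people using the tool, the quality numbers rest on too little work to mean much.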
That discipline is also the cheapest insurance against the failure rate baked into hype cycles. When teams need the quarter mapped against their own workflow, that's the work behind the 90-Day AI Implementation Sprint — the same sequencing discipline I've used on a 28,000-user migration with zero downtime, pointed at one AI workflow instead of an enterprise rollout. Plan the 90-day sprint before the next round of AI spend, not after.