The number under the demo is doing a lot of hiding
The screen share is smooth. The assistant pulls a contract, summarizes it, drafts a reply, files it in the right folder. The room nods. Then the slide flips to a single figure with a dollar sign in front of it, and someone asks you, the person who signs it, whether that is a good price. You have no idea. Nobody in the room does, because the demo measured the wrong thing. It measured how good the vendor is at demos.
Here is the trap, stated plainly: a demo runs on data that was already clean, permissions that were already set, and a workflow that was already decided. Your business has none of those things waiting for it. The cost of an AI engagement lives almost entirely in the gap between the demo's tidy sandbox and your actual Tuesday. So when a proposal collapses that gap into one blended number, it is not simplifying the decision for you. It is hiding the decision from you.
Three large bodies of evidence say the same thing about where value actually comes from, and none of them mentions demo quality. McKinsey's State of AI research ties returns to workflow redesign rather than tool adoption. IBM's Institute for Business Value work points to ownership and capability inside the organization. PwC's Responsible AI survey ties durability to controls and governance. Redesign, ownership, capability, controls. That is the work. A demo shows you none of it, which means a proposal priced to match a demo is priced to match the wrong thing.
Make them itemize, then read where they got nervous
The fastest way to convert a one-number quote into a real one is to ask for it broken into the work that has to happen regardless of which model wins: diagnosis, workflow redesign, getting your data into a usable state, integration into the systems people already live in, governance and access controls, the build itself, training the staff who have to use it, and the measurement that proves it worked. Send that list back and ask for a number against each line. What comes back tells you more than the total ever could.
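If it helps to make the reading mechanical, the itemized response can be tabulated so each line shows up as a share of the total and any line that came back blank stands out. What follows is a minimal sketch, not a pricing model: the line names mirror the list above, while the helper name review_quote and every dollar figure are invented for illustration.

```python
# The eight work items named in the paragraph above; keys are illustrative labels.
REQUIRED_LINES = [
    "diagnosis",
    "workflow_redesign",
    "data_readiness",
    "integration",
    "governance_access",
    "build",
    "training",
    "measurement",
]


def review_quote(quote):
    """Return (shares, missing) for an itemized quote.

    quote maps a line name to a dollar amount. A required line that is
    absent, None, or zero is reported as missing rather than counted.
    """
    priced = {k: v for k, v in quote.items() if k in REQUIRED_LINES and v}
    missing = [k for k in REQUIRED_LINES if k not in priced]
    total = sum(priced.values())
    # Each priced line as a fraction of the priced total.
    shares = {k: round(v / total, 3) for k, v in priced.items()} if total else {}
    return shares, missing


# Hypothetical vendor response; the amounts are made up for this example.
example_quote = {
    "diagnosis": 5000,
    "workflow_redesign": 10000,
    "integration": 15000,
    "build": 60000,
    "training": 10000,
}

shares, missing = review_quote(example_quote)
for line, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{line:20s} {share:.1%}")
print("No number given for:", ", ".join(missing) or "none")
```

In this made-up response the build carries most of the price, and data readiness, governance, and measurement have no number at all. Those gaps are the conversation to have before you sign.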
Watch two lines in particular, because they are where vendors who only know how to demo tend to flinch. The first is data readiness. A demo never shows you the three duplicate customer records, the field someone has been using for the wrong purpose since 2021, or the spreadsheet that lives on one person's laptop. That cleanup is real money and real calendar time, and a vendor who waves it off has either not looked at your data or is planning to surprise you with a change order. The second is access and auditing. Microsoft's documentation on Copilot architecture, data protection, and auditing lays out how an AI assistant inherits whatever permissions already exist, which means messy permissions let the assistant surface content to people who should never see it. That is a cost and a liability whether or not your stack is Microsoft. The NIST AI Risk Management Framework exists precisely because this layer is work, not a checkbox.
So ask two blunt questions and write down the answers. Which assumptions, if wrong, change this price the most? And what would you find in our environment that would make you pause the build? A consultant who has actually done this names source-data cleanup, exception handling, monitoring, and staff adoption without hesitating. A consultant selling you a demo changes the subject back to the demo.
Approve one workflow with a number on both ends
Do not authorize a broad engagement off a presentation. Authorize one workflow that has a measured starting point and a target you agreed to in advance. Say you run a 40-person B2B services firm and quoting is the bottleneck. Before anyone builds anything, write down today's truth: a quote takes four days on average, twelve are sitting in the queue past their service window, and one in six goes out with a pricing error someone has to walk back. Now you have a baseline. The proposal's job is to move those three numbers, and you will know in weeks rather than guessing for a year.
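A baseline like that is easy to track as a small scorecard. The sketch below is one hedged way to do it: the three baseline values come from the example above, but the targets, the current readings, and the names Metric and progress are assumptions made up for illustration. Your own targets are whatever you and the vendor agree to in advance.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    name: str
    baseline: float  # measured before any build starts
    target: float    # agreed in advance (hypothetical values below)
    current: float   # latest reading (hypothetical values below)

    def progress(self) -> float:
        """Fraction of the baseline-to-target gap closed so far."""
        gap = self.baseline - self.target
        if gap == 0:
            return 1.0  # nothing to close
        return (self.baseline - self.current) / gap


metrics = [
    # Baselines follow the quoting example; targets and current readings are invented.
    Metric("avg quote cycle time (days)", baseline=4.0, target=2.0, current=3.0),
    Metric("quotes past service window", baseline=12, target=3, current=6),
    Metric("pricing error rate", baseline=1 / 6, target=1 / 20, current=1 / 10),
]

for m in metrics:
    print(f"{m.name:32s} {m.progress():.0%} of the way to target")
```

A reading below zero means the number moved the wrong way, which is worth knowing in week three rather than month twelve.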
Good baselines are boring and specific on purpose: cycle time, backlog aging, error and rework rate, handoff misses, customer response time, and how fast revenue opportunities get followed up on. The one most teams forget is actual staff adoption, because a tool nobody opens has a return of zero no matter how clean the build. Pick the one workflow where the pain has a dollar figure attached, measure it cold, then judge cost against that movement instead of against a slideshow.
Two things help you do this before you sign anything. Use our breakdown of AI consulting cost to set fair expectations for what each tier of work should run, and run your candidate workflow through the AI ROI calculator to see the return on paper before you commit budget. Walk into the next vendor conversation already knowing your baseline and your target. Then the demo becomes what it should have been all along, a nice-to-have, not the basis for a six-figure decision.