The demo answers a question you'll never ask in production
Picture the pitch. The consultant types "What's our standard indemnification clause?" into a clean interface, and a tidy paragraph appears in two seconds. Everyone nods. The problem is that nobody at your firm asks questions that clean. A senior associate asks "Can we use the 2024 MSA language for a client in a regulated industry, or did we change it after the data-privacy update?" That question has a date, a condition, and a version dependency baked into it. The demo never has to survive any of that.
For a professional services or B2B services firm, a knowledge assistant lives or dies on one thing: whether it pulls from the right version of the right document, scoped to who's allowed to see it. A polished retrieval interface tells you almost nothing about that. The RSM middle-market AI survey, the OECD report on AI adoption by small and medium-sized enterprises, and the Deloitte State of AI in the Enterprise 2026 converge on the unglamorous part of the work: workflow ownership, data readiness, and a path from a sandbox to something a real team uses on a Tuesday.
So before anyone schedules a demo, hand the consultant one of your actual documents (a master services agreement, an engagement letter, a methodology doc) and ask: where does this live, who's allowed to read it, which version is current, and what happens when we update it next quarter? The answers to those four questions predict the project. The demo predicts nothing. Start with the AI readiness assessment buyer guide before you sit through a single slide.
The dangerous answer is the wrong one delivered with confidence
A knowledge assistant doesn't fail loudly. It fails when it retrieves a contract template you retired eighteen months ago and presents it as current, or when it surfaces a partner's confidential client memo to an analyst who was never cleared to see it. Neither of those shows up in a demo, because the demo dataset is curated, single-version, and permission-flat. Your real document store is none of those things — it's a decade of overlapping folders, three naming conventions, and access rules that live in someone's head.
This is where the budget conversation gets honest. The license is the cheap part. The expensive part is the work that makes answers trustworthy: deduplicating and retiring stale source documents, mapping your existing permission groups onto what the assistant is allowed to retrieve, building a way for someone to flag a wrong answer, and keeping a log of what the system said so you can audit it later. The NIST AI Risk Management Framework and CISA AI Data Security Best Practices exist precisely because this layer is where firms get burned. A consultant who hasn't priced source cleanup and permission design is quoting you a fraction of the real number.
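To make that trust layer concrete, here is a minimal sketch in Python of the gate that sits between "documents that match the question" and "documents the assistant may answer from." Every name in it (`Document`, `eligible_sources`, `AuditLog`, the group labels) is hypothetical and not taken from any product. It assumes your cleanup work has already tagged each document with a family, a version number, a retired flag, and the permission groups allowed to read it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    doc_id: str
    family: str                # hypothetical label grouping versions of one document
    version: int
    retired: bool
    allowed_groups: frozenset  # permission groups cleared to read this document


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, user, query, doc_ids):
        # Keep what was asked and which sources were used, for later review.
        self.entries.append({"user": user, "query": query, "sources": list(doc_ids)})


def eligible_sources(docs, user_groups):
    """Return the latest non-retired version per family that the user may read."""
    latest = {}
    for doc in docs:
        if doc.retired:
            continue  # retired templates never reach the assistant
        current = latest.get(doc.family)
        if current is None or doc.version > current.version:
            latest[doc.family] = doc
    # Permissions are checked after picking the current version, so a user who
    # cannot read the newest version does not silently get an older one.
    return [d for d in latest.values() if d.allowed_groups & user_groups]
```

The ordering is the design choice worth arguing about with your consultant: this sketch picks the current version first and checks permissions second, so a restricted update never falls back to a stale copy. Whether your firm wants that behavior is a policy decision, not a technical one.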
Two questions separate the serious partner from the reseller. First: "When a document is confidential to one client team, how does the assistant know not to answer from it for everyone else?" If the answer is "we rely on the underlying tool," push harder. Second: "What gets submitted to the underlying model, what's retained, and who can approve production use?" Check the answer against OpenAI Enterprise Privacy or the equivalent for whatever model sits underneath. Then sanity-check the proposed timeline against the 90-day AI implementation plan instead of taking the roadmap deck at face value.
Score the work on one production workflow, not the strategy deck
Here's the test that cuts through everything. Ask the consultant to name a single workflow — say, an associate drafting a first-pass engagement letter, or support staff answering "what's our policy on X" — and then ask: what will be true at the end of that workflow's first live month that isn't true today? A strong answer is concrete and uncomfortable: "Drafting time on standard engagement letters drops from two hours to forty minutes, every answer cites the source document and version, and we have a weekly review where a partner spot-checks ten responses and flags any that pulled from the wrong source." A weak answer is a maturity model and a phased vision.
Notice what that good answer contains. A named owner. A baseline you can measure. A citation requirement so the assistant shows its work. A human review cadence. The best answers add a stop rule: the courage to say "if accuracy is below X after thirty days, we pause and fix the sources before we expand." For a professional services firm, where a wrong answer can travel into a client deliverable, that review loop isn't bureaucracy. It's the entire point of doing this carefully.
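The review loop and stop rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed method: `SpotCheck`, `review_summary`, and `should_pause` are hypothetical names, and the sketch assumes a reviewer records two yes/no judgments per sampled response. The threshold is deliberately a required argument, because the "X" in the stop rule is a number your firm has to agree on up front.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpotCheck:
    response_id: str
    cited_source: bool    # did the answer cite a document and version?
    correct_source: bool  # did the reviewer confirm it pulled from the right source?


def review_summary(checks):
    """Summarize one review period: how many responses were checked, and how many passed."""
    total = len(checks)
    if total == 0:
        raise ValueError("no spot checks recorded for this period")
    # A response only counts as accurate if it both cited and used the right source.
    passed = sum(1 for c in checks if c.cited_source and c.correct_source)
    return {"reviewed": total, "accuracy": passed / total}


def should_pause(checks, accuracy_threshold):
    """Stop rule: pause expansion when measured accuracy falls below the agreed threshold."""
    return review_summary(checks)["accuracy"] < accuracy_threshold
```

Counting an uncited answer as a failure, even when it happened to be right, is a choice made here to keep the citation requirement enforceable. A firm could reasonably track the two separately.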
What you should walk away with is not a tool comparison and not a transformation narrative. It's one accountable workflow, one owner, and a number you can defend in a partners' meeting. To keep that number real and resist the temptation to claim savings nobody can trace, use AI ROI measurement without fake savings as your guardrail — and when you're ready to sequence the whole effort, that's what the AI roadmap is for.