Start where the work is clerical, not where it's legal
Picture the Friday afternoon write-down at a 25-lawyer firm: a partner staring at a 40-line bill, slashing "reviewed correspondence and conferred re: strategy" entries that a client will reject, rewriting time narratives the associate dashed off three weeks ago. That hour is the one to give to AI first. Not the brief. Not the advice memo. The billing narrative cleanup, the matter-intake form that's missing the adverse party's full legal name, the internal summary of a document set nobody has had time to read.
The reason is specific: those tasks have a human who can see every input and check every output, and they don't ask the model to form a legal conclusion. The Thomson Reuters 2026 AI in Professional Services report and Deloitte's State of AI in the Enterprise 2026 both describe the same squeeze on legal services: real adoption pressure colliding with privilege, confidentiality, and the bar's discomfort with anything that looks like outsourced judgment. That tension doesn't get resolved at the firm level with a policy memo. It gets resolved one workflow at a time, by choosing tasks where the worst-case error is a typo, not a waived privilege.
The trap isn't a weak summary — it's the conflicts check you didn't run
Here's the failure most firms walk into. The pilot "works." Intake summaries come back clean, billing narratives read better, the associates love it. Then someone feeds a new-matter packet into the same tool, and the model has now seen, and possibly logged with a vendor, the identity of a prospective client the firm hasn't yet cleared for conflicts. Or an intake summary quietly pulls in a paragraph from a co-defendant's privileged file because both matters live in the same folder. The output looked fine. The boundary it crossed was invisible until it wasn't.
So measure the boundary, not the prose. Before you start, write down the baseline you actually care about: matter-opening rework, document-classification errors, the minutes per bill spent on narrative cleanup, and the count of intake items kicked back for missing context like a full party name or jurisdiction. Then run a genuinely boring weekly review of four numbers — attorney approvals, privilege-sensitive exceptions caught, missing or wrong source citations, and any client-facing material held for rewrite. If the firm is generating more first drafts but the rework and exception counts haven't moved, you bought a draft machine, not leverage. Only once those measures have a named owner should you reach for the AI Opportunity Score or the AI ROI Calculator to size the next move.
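As a rough illustration of that weekly check, here is a minimal Python sketch. Everything in it is an assumption made for the example: the field names, the choice to add the three exception-type counts into one figure, the 10% tolerance that decides whether a number has "moved," and the sample figures themselves. None of it comes from a real firm or a real tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeeklyReview:
    """One week of review counts. All field names are illustrative."""
    first_drafts: int
    attorney_approvals: int
    privilege_exceptions: int
    citation_errors: int
    held_for_rewrite: int
    rework_items: int  # matter-opening rework plus intake items kicked back

    @property
    def exception_count(self) -> int:
        # Assumption: the three exception-type counts are simply added together.
        return self.privilege_exceptions + self.citation_errors + self.held_for_rewrite


def has_moved(before: int, after: int, tolerance: float = 0.10) -> bool:
    """True if the count changed by more than `tolerance` (an assumed 10%)."""
    if before == 0:
        return after != 0
    return abs(after - before) / before > tolerance


def is_draft_machine(baseline: WeeklyReview, current: WeeklyReview) -> bool:
    """More first drafts, but rework and exception counts have not moved."""
    more_drafts = current.first_drafts > baseline.first_drafts
    rework_flat = not has_moved(baseline.rework_items, current.rework_items)
    exceptions_flat = not has_moved(baseline.exception_count, current.exception_count)
    return more_drafts and rework_flat and exceptions_flat


# Made-up sample figures, for illustration only.
baseline = WeeklyReview(20, 18, 3, 4, 2, 12)
flat_week = WeeklyReview(45, 40, 3, 4, 2, 12)
better_week = WeeklyReview(45, 43, 1, 1, 1, 6)

print(is_draft_machine(baseline, flat_week))    # True
print(is_draft_machine(baseline, better_week))  # False
```

The useful part is less the arithmetic than the discipline: the baseline is recorded before the pilot starts, and the comparison runs on the same handful of counts every week.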
Govern by the matter, because that's how privilege works
A law firm's access model is already organized the right way — by matter, with conflicts walls — and your AI workflow has to inherit that, not flatten it. The NIST AI Risk Management Framework gives you the spine: write down the intended use, the risk, how you'll measure it, and who's accountable when it goes sideways. The CISA AI data-security best practices tell you how to lock down where confidential records sit, what gets retained, and what a vendor is allowed to keep.
Concretely: enforce matter-level permissions so the tool can't read across an ethical wall, keep every output traceable back to the source document it summarized, require attorney sign-off before anything touches a client or a court, and lock the tool out of any matter where a conflict or privilege question is open. Run a single matter type cleanly for a quarter. Prove that confidentiality held, the citations checked out, and the attorneys actually trust it. Then, and only then, extend the same pattern to an adjacent internal routine. Scaling an ungoverned pilot doesn't multiply the upside; it multiplies the matters at risk.
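The four controls above can be sketched as a small gate that sits between the document store and the AI tool. This is a hedged illustration, not a reference design: the `Matter`, `Document`, and `Summary` types, the `documents_for_tool` and `releasable` functions, and the sample IDs are all hypothetical, and a real deployment would enforce the same rules inside the firm's document management and identity systems.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


class AccessDenied(Exception):
    """Raised when the tool must not touch a matter."""


@dataclass(frozen=True)
class Matter:
    matter_id: str
    authorized_users: FrozenSet[str]
    conflict_check_cleared: bool
    privilege_question_open: bool


@dataclass(frozen=True)
class Document:
    doc_id: str
    matter_id: str


@dataclass(frozen=True)
class Summary:
    text: str
    source_doc_ids: Tuple[str, ...]    # traceability back to source documents
    approved_by: Optional[str] = None  # attorney sign-off, if any


def documents_for_tool(user: str, matter: Matter, store: List[Document]) -> List[Document]:
    """Return only the documents the tool may read for this user and matter."""
    # Lock the tool out while a conflict or privilege question is open.
    if not matter.conflict_check_cleared or matter.privilege_question_open:
        raise AccessDenied(f"matter {matter.matter_id} has an open conflict or privilege question")
    # Matter-level permission: the user must be inside the ethical wall.
    if user not in matter.authorized_users:
        raise AccessDenied(f"{user} is not authorized on matter {matter.matter_id}")
    # Filter by matter ID, never by folder, so a shared folder cannot leak files.
    return [d for d in store if d.matter_id == matter.matter_id]


def releasable(summary: Summary) -> bool:
    """An output leaves the firm only if it cites sources and an attorney signed off."""
    return bool(summary.source_doc_ids) and summary.approved_by is not None


# Hypothetical data: two matters whose files sit in the same shared folder.
shared_folder = [Document("d1", "M-100"), Document("d2", "M-200"), Document("d3", "M-100")]
cleared = Matter("M-100", frozenset({"associate_a"}), True, False)
pending = Matter("M-200", frozenset({"associate_a"}), False, False)

visible = documents_for_tool("associate_a", cleared, shared_folder)
print([d.doc_id for d in visible])  # ['d1', 'd3']
```

Filtering on the matter ID rather than the storage location is the design choice that matters here: it is what keeps a co-defendant's file out of a summary when both matters live in one folder.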