The 47-day invoice nobody was working
Picture the aging report at a 90-person B2B software company. There's a $60K invoice sitting at 47 days. It isn't disputed. The customer isn't broke. It's late because the renewal got re-papered in March, the new PO number lives in a Slack thread, the account manager who knew the contact left in April, and the controller chasing it has 80 other lines to work and zero context on this one. So it ages. Then it ages into the next bucket. That is what a bad DSO number is actually made of — not deadbeats, but invoices no one had the context to work in time.
This is why collections reads like a people problem and behaves like a workflow problem. The information needed to make one good follow-up call is scattered across the ERP, the CRM, a shared inbox, a contracts folder, and the memory of whoever owned the account last quarter. McKinsey's read on how finance teams are putting AI to work keeps landing on the same pattern: the wins come from assembling and routing context, not from generating clever output. Trapped working capital is real money — PwC's Working Capital Study has tracked hundreds of billions sitting idle on balance sheets, much of it recoverable through nothing more exotic than chasing the right invoice on day three instead of day thirteen.
So the useful version of AI here is narrow and unglamorous. For each open invoice it pulls the history, checks the customer's standing, summarizes the last three exchanges, flags the missing PO or the open support ticket, drafts the reminder, and routes anything that smells like a dispute to a human. It does not autonomously chase your top-five logo because a date crossed a threshold. Before you build any of it, size the prize with the AI ROI Calculator: research minutes per invoice times open invoice count is usually a bigger number than people expect.
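For a back-of-envelope version of that sizing, the arithmetic fits in a few lines. This is only a sketch: the function name and every input below (15 research minutes per invoice, 80 open invoices, two follow-ups a month, a $55 loaded hourly rate) are hypothetical placeholders, not benchmarks. Swap in your own numbers.

```python
def monthly_research_cost(
    minutes_per_invoice: float,
    open_invoices: int,
    follow_ups_per_month: float,
    loaded_hourly_rate: float,
) -> tuple[float, float]:
    """Return (hours, dollars) spent each month just gathering context.

    All four inputs are assumptions you supply; nothing here is measured.
    """
    hours = minutes_per_invoice * open_invoices * follow_ups_per_month / 60
    return hours, hours * loaded_hourly_rate


# Hypothetical inputs for illustration only.
hours, dollars = monthly_research_cost(
    minutes_per_invoice=15,
    open_invoices=80,
    follow_ups_per_month=2,
    loaded_hourly_rate=55.0,
)
print(f"{hours:.0f} hours/month, ${dollars:,.0f}/month")
```

With those placeholder inputs the research step alone comes to 40 hours a month, which is the number to compare against whatever the workflow costs to build and review.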
Sort your accounts before you automate a single one
The fastest way to wreck a customer relationship with AI is to treat every overdue line the same. A 12-day-late invoice from a happy enterprise account that's mid-renewal needs a featherweight nudge to the right contact. A 12-day-late invoice tied to an open delivery complaint needs to go nowhere near an automated reminder — it needs the account exec and probably an apology. Same age, opposite play. If your workflow can't tell those two apart, it will torch goodwill at scale.
So the first design move isn't prompts — it's sorting. Sit with your best collections person and map the buckets they already work by instinct: routine slow-payers who respond to a polite ping, accounts where the relationship outweighs the invoice, invoices blocked by something you owe the customer, and accounts that should never see a templated message with your logo on it. Those buckets become the routing logic. AI handles preparation and the low-stakes nudges; humans keep the relationship-sensitive and disputed lines. Gartner expects embedded AI in cloud ERP to drive a 30% faster financial close by 2028 — but that speed only materializes if the routing is sound, because the close gets slower, not faster, when someone has to walk back an automated message that went to the wrong stakeholder.
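To make "buckets become the routing logic" concrete, here is a minimal sketch of the four buckets as a routing function. The field names, flag names, and the precedence order are assumptions for illustration; your best collections person, not this code, decides what the real rules are.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO_NUDGE = "auto_nudge"                  # routine slow-payer, polite ping
    RELATIONSHIP_OWNER = "relationship_owner"  # relationship outweighs the invoice
    RESOLVE_BLOCKER = "resolve_blocker"        # we owe the customer something first
    NO_TEMPLATES = "no_templates"              # never send a templated message


@dataclass
class OverdueInvoice:
    invoice_id: str
    days_late: int
    never_template: bool = False         # hypothetical flag set by the collections lead
    open_complaint: bool = False         # e.g. an open delivery complaint
    relationship_sensitive: bool = False


def route(inv: OverdueInvoice) -> Route:
    # Assumed precedence: the most restrictive bucket wins.
    # Note that days_late never appears here; age alone does not pick the play.
    if inv.never_template:
        return Route.NO_TEMPLATES
    if inv.open_complaint:
        return Route.RESOLVE_BLOCKER
    if inv.relationship_sensitive:
        return Route.RELATIONSHIP_OWNER
    return Route.AUTO_NUDGE


# Same age, opposite play.
routine = OverdueInvoice("INV-1001", days_late=12)
complaint = OverdueInvoice("INV-1002", days_late=12, open_complaint=True)
print(route(routine).value, route(complaint).value)
```

Only the first bucket ever reaches an automated send; the other three hand the invoice, with its prepared context, to a person.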
Two non-negotiables sit underneath all of this. First, every drafted message shows its work (which invoice, which contact, which terms, what it's unsure about) so the reviewer approves a decision rather than rubber-stamping a black box. Second, the underlying data has to be trustworthy. If your contact fields, payment terms, and account-owner mappings are stale, automation will email the departed AP clerk on day one and keep measuring DSO against the wrong terms from then on. That's the same hygiene problem covered in AI CRM cleanup: fix the data first, or you'll scale the errors faster than the collections.
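One way to picture "shows its work" is a draft record that carries its own evidence and doubts to the reviewer. The sketch below is an assumption-heavy illustration: the field names, the sample invoice, and the 180-day staleness threshold are all hypothetical, and a real version would read them from your ERP and CRM.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DraftedReminder:
    invoice_id: str
    contact_email: str
    payment_terms: str
    contact_last_verified: date          # when someone last confirmed this contact
    body: str
    uncertainties: list[str] = field(default_factory=list)


def review_notes(
    draft: DraftedReminder, today: date, max_contact_age_days: int = 180
) -> list[str]:
    """Collect everything the reviewer should see before approving."""
    notes = list(draft.uncertainties)
    age = (today - draft.contact_last_verified).days
    if age > max_contact_age_days:  # threshold is an assumption, tune it
        notes.append(f"contact not verified in {age} days")
    return notes


draft = DraftedReminder(
    invoice_id="INV-2044",
    contact_email="ap@example.com",
    payment_terms="net-30",
    contact_last_verified=date(2024, 1, 10),
    body="Friendly reminder that invoice INV-2044 is past due.",
    uncertainties=["PO number not found in ERP"],
)
notes = review_notes(draft, today=date(2024, 9, 1))
for note in notes:
    print("-", note)
```

An empty notes list is the only case where a lighter-touch approval makes sense; anything else is a decision for the reviewer, not a formality.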
90 days: prove it on one customer segment, then widen
Don't boil the whole aging report. Days 1 to 30 are for picking one clean segment and instrumenting it. Choose a customer group where payment terms are unambiguous, account ownership is current, and disputes are rare — typically mid-tier accounts on standard net-30 terms, not your top ten and not the chronic problem children. Document the path an invoice currently takes from "due" to "paid," and build the workflow to do the dull part: gather context, score the risk, draft the nudge, kick out the exceptions.
In days 31 to 60, run it in copilot mode with a human approving every outbound message: correcting tone, catching the missing PO, flagging the account that should've been escalated. This is the stretch that tells you the truth. The honest question isn't "did it send emails," it's "did it actually shrink the work, or did we just trade a chasing queue for a reviewing queue?" Track four things on the segment: minutes to prep a follow-up, dollars aging into the next bucket each week, hours from dispute-flag to the right owner, and how often the reviewer overrides the draft. If the override rate isn't falling by week eight, your routing logic is wrong, not your model.
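The override-rate check is simple enough to keep in a script next to the pilot log. A minimal sketch, assuming you record drafts reviewed and drafts overridden per week; the "falling" test used here (later half of the pilot averages lower than the earlier half) is one reasonable definition, not the only one, and the weekly counts are made up.

```python
from statistics import mean


def override_rates(weekly_counts: list[tuple[int, int]]) -> list[float]:
    """weekly_counts holds (drafts_reviewed, drafts_overridden) per week.

    Weeks with no reviewed drafts are skipped rather than counted as zero.
    """
    return [
        overridden / reviewed
        for reviewed, overridden in weekly_counts
        if reviewed > 0
    ]


def is_falling(rates: list[float]) -> bool:
    # Assumed definition: the later half averages lower than the earlier half.
    if len(rates) < 2:
        return False
    half = len(rates) // 2
    return mean(rates[half:]) < mean(rates[:half])


# Hypothetical pilot weeks 5 through 8.
improving = override_rates([(40, 22), (42, 18), (45, 12), (44, 8)])
stuck = override_rates([(40, 20), (40, 20), (40, 21), (40, 22)])
print(is_falling(improving), is_falling(stuck))
```

If the second pattern is what your log looks like, revisit which accounts are being routed to automated drafts before touching prompts or models.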
By day 90 the low-stakes nudges in that one segment can run on a steady cadence with lighter human touch, while disputes and your relationship accounts stay fully human-owned — and now you have a real before/after on segment DSO to decide whether to extend it to the next bucket. Want a fast read on whether collections is even your highest-leverage starting point? Run the AI Opportunity Score before you scope anything — for some teams cash collection is the obvious first win; for others it's a distraction from a bigger one.
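For the before/after itself, the standard DSO calculation (receivables divided by credit sales, scaled by days in the period) is enough, as long as it is run on the pilot segment alone and over the same period length both times. The figures below are invented purely to show the arithmetic; they are not results.

```python
def dso(ending_receivables: float, credit_sales: float, days_in_period: int) -> float:
    """Days sales outstanding for one segment over one period."""
    if credit_sales <= 0:
        raise ValueError("credit_sales must be positive")
    return ending_receivables * days_in_period / credit_sales


# Hypothetical segment figures for two 90-day periods.
before = dso(ending_receivables=450_000, credit_sales=900_000, days_in_period=90)
after = dso(ending_receivables=360_000, credit_sales=900_000, days_in_period=90)
print(f"before: {before:.1f} days, after: {after:.1f} days, change: {after - before:+.1f}")
```

Keep the segment definition frozen between the two measurements; moving accounts in or out mid-pilot makes the comparison meaningless.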