Skip to content
Contact Us
Technical Debt4 min

The Margin That Wasn't There: Auditing AI Vendor Dependency Before You Sign

A SaaS target's 82% gross margin can hide a single-vendor API bill that quietly halves it. How to diligence AI dependency, model drift, and COGS before LOI.

Abstract representation of AI API connections breaking under the weight
of financial costs and technical debt.
Figure 01 Abstract representation of AI API connections breaking under the weight of financial costs and technical debt.
Answer summary

The practical answer

Short answer
A SaaS target's 82% gross margin can hide a single-vendor API bill that quietly halves it. How to diligence AI dependency, model drift, and COGS before LOI.
Best fit
Industry: SaaS & Technology M&A. Function: Technology Due Diligence
Operating path
Technical Debt -> Turnaround & Restructuring -> Transaction Advisory Services -> Valuations
Key metric
70% Proportion of an AI model's lifetime costs driven entirely by ongoing inference and compute expenses.

The line item the model never showed you

On a recent $250M SaaS carve-out, the data room looked clean: 82% software gross margin, SOC 2 done, licenses tidy, a "flagship AI copilot" splashed across every sales deck. The copilot was the whole equity story. So I asked the one question the CIM never answers: what does that copilot cost to run per active user per month? Nobody on the management team had the number. We built it ourselves — adoption curves against raw inference volume against the target's actual per-token contract — and the 82% margin slid to 47% four quarters into the rollout the seller was projecting. The growth was real. The margin underneath it was rented.

This is the trap specific to AI-era diligence: SaaS spent a decade engineering near-zero marginal cost, and generative AI quietly dragged variable COGS back onto the P&L through the side door. Buyers still model token spend like a rounding error. Bain's analysis of SaaS unit economics under AI adoption documents the shape of it — one marketing-tech firm grew revenue 38% while its AI infrastructure and hosting costs ran up 349%. That is not a company scaling. That is a company funding its own customer growth out of margin.

The root cause is almost always architectural and almost always invisible in a deck. Founders racing to ship wire product features straight into a single vendor's endpoint with no abstraction in between, which hands pricing power and behavior control entirely to the model provider. TrueFoundry's breakdown of generative AI cost drivers puts inference at the dominant share of a model feature's lifetime cost — roughly 70% of what you'll spend over its life is the ongoing bill, not the build. If the target can't route that inference across providers or pull it in-house, the "proprietary AI" you're paying a premium for is someone else's compute resold at a structural loss.

Two failures that never appear in the data room

Cost is the failure you can model. The two that wreck deals post-close are the ones management has never had to write down. The first is drift: hardcode a feature to a specific model version and you've adopted that vendor's roadmap as your own. A prompt chain that returned clean structured output in March starts refusing or hallucinating in April — not because anyone touched the code, but because the provider tuned a guardrail over a weekend and didn't send a memo. The second is deprecation: the version your product depends on gets a sunset date, and now your engineering roadmap belongs to a vendor's lifecycle calendar, not yours.

Surviving both takes real orchestration discipline, and most mid-market engineering teams simply don't have it. Operator guidance on moving AI agents from demo to production describes the pattern bluntly — projects stall in pilot because single-model dependency, surprise deprecations, and no testing harness make the thing impossible to operate. Roughly 85% of AI initiatives never scale past that pilot wall, and almost none of those companies disclose it as risk. They disclose it as "still optimizing."

Watch the second-order damage, because it hits the number you're actually buying: net revenue retention. CIO's reporting on the inference bill nobody budgeted for shows costs climbing faster than measurable outcomes when teams skip the operating model entirely. When a feature breaks every time the upstream model shifts, customer success burns its week apologizing for bad outputs instead of driving expansion — and that churn shows up two quarters after you've already paid for the growth. The technology due diligence red flags that kill deals increasingly trace back to exactly this dependency, not to anything in the code-quality scan.

A technical architecture diagram contrasting a brittle single-API
connection with a resilient, multi-model AI Gateway.
A technical architecture diagram contrasting a brittle single-API connection with a resilient, multi-model AI Gateway.

The 48-hour test, and what to demand before LOI

Static code review, license sweep, SOC 2 — necessary, no longer sufficient. The diligence question that actually prices AI risk is portability: how fast can this product swap its foundational model without rewriting core logic? My threshold is 48 hours. If the team can't credibly demonstrate a provider swap inside two days, the dependency is structural, the margin is rented, and the premium multiple is unjustifiable. EthosData's overview of AI in M&A diligence makes the same point — diligence quality drops sharply when technical dependencies aren't inspected at depth, and that gap only widens as API-built targets flood the deal pipeline.

The architecture that earns the premium is a gateway, not a wrapper — an abstraction layer that routes each request across models by cost, latency, and compliance. Say a 60-person legal-tech company: route bulk internal document parsing to a self-hosted open-weight model for pennies, route the genuinely hard multi-step reasoning to a premium commercial API, and route around any provider that drifts or rate-limits. That routing is the only mechanism that decouples the product's unit economics from one vendor's margin requirements. A target that has it has a margin floor. A target that doesn't has a vendor's permission slip.

So before LOI, make the CTO show you the artifacts, not the partnership slide. Demand a full map of external AI dependencies. Ask them to run a live model swap in the room. Make them show prompt version control, automatic fallback on rate limits and outages, and regression tests built specifically to catch drift. If you get a blank look and a deck about "our deep OpenAI partnership," you're buying a proof-of-concept wearing a product's clothes. Run our technology due diligence checklist for software acquisitions to force these conversations while you still have the leverage to reprice — because the COGS reality always arrives, and it never arrives on your schedule.

Continue the operating path
Topic hub Technical Debt Quantification in dollars, not adjectives. Then a remediation plan that runs in parallel with delivery. Pillar Turnaround & Restructuring Technical debt is real money. Once you can name it as a number — its impact on velocity, EBITDA, and exit multiple — it stops being a vague engineering complaint and becomes a board agenda item. Service Transaction Advisory Services Operator-led buy-side and sell-side diligence for technology middle-market deals. Financial rigor, technical diligence, and integration risk in one workstream. Service Valuations Credible valuation work for SaaS, services, IP, ARR/MRR, cap tables, and exit readiness in technology middle-market transactions. Service Performance Improvement Revenue, margin, delivery, technical debt, and operating-system improvement for technology firms with stalled growth or compressed EBITDA.
Related intelligence
Sources
  1. Bain analysis of SaaS unit economics under AI adoption
  2. TrueFoundry analysis of generative AI cost drivers
  3. Operator guidance on moving AI agents from demo to production
  4. CIO reporting on AI inference costs
  5. EthosData overview of AI in M&A diligence
Move on this

A 14-day operator-led diagnostic, before the gap is priced into your multiple.

No retainer until we agree on the work.

Request a Turnaround Assessment →