Build vs buy: the AI math behind the decision most teams get wrong

The build-vs-buy math is wrong because the cost model is wrong

Walk into any AI vendor conversation and the cost case is the model API price times expected token volume. That is the line on the slide. The lines missing from the slide are: integration engineering, ongoing eval set maintenance, observability infrastructure, the ops team that runs production AI, the human review queue when the model is uncertain, retraining or refresh cycles as data drifts, security and compliance reviews, and the procurement overhead of vendor management.

When all the lines are accounted for, the steady-state cost of buying versus building shifts materially. Sometimes the buy decision still wins; sometimes it does not. The point is that the original analysis compared the two paths on incomplete data, which means the decision answered a different question from the one being asked.
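
To see how the missing lines move the total, here is a minimal 36-month cost sketch. Every line item and figure below is an illustrative assumption, not a benchmark; the point is the shape of the model, not the specific numbers.

```python
# Illustrative 36-month TCO sketch for a "buy" path. Every figure below is
# an assumption for demonstration only; substitute your own estimates.

MONTHS = 36

# The line that usually makes it onto the slide.
api_cost = 25_000 * MONTHS  # model API spend at expected token volume

# The lines that usually do not.
hidden_costs = {
    "integration_engineering": 180_000,        # one-time: wiring into data/identity/ops
    "eval_set_maintenance":      4_000 * MONTHS,
    "observability_infra":       3_000 * MONTHS,
    "production_ops_team":      10_000 * MONTHS,
    "human_review_queue":        8_000 * MONTHS,
    "refresh_cycles":           60_000,        # periodic retraining / index refresh
    "security_compliance":      40_000,        # reviews and audits
    "vendor_management":         2_000 * MONTHS,
}

total = api_cost + sum(hidden_costs.values())
print(f"Slide number:   ${api_cost:,}")            # $900,000
print(f"Full TCO:       ${total:,}")               # $2,152,000
print(f"Drift multiple: {total / api_cost:.1f}x")  # 2.4x
```

With these assumed inputs, the drift lands at the top of the 1.6–2.4x range cited below; with your own inputs it may land elsewhere, which is exactly why the model needs all the lines.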

Build-vs-buy is rarely binary; it is usually 70/30

The framing 'build or buy' implies a single answer for the whole capability. The more useful framing is: which 70% should we buy, and which 30% must we build? Buy the foundation — frontier models, vector databases, observability platforms, eval frameworks. Build the parts that are specific to your domain, data, and integration surface — retrieval over your corpus, fine-tunes on your tasks, the application logic that wires the components into your operations.

Vendors that sell whole-stack solutions push back on this framing because it shrinks their footprint. The framing is correct anyway. Whole-stack vendors that own everything from infrastructure to application produce strategic dependencies that are expensive to undo. Modular buy-and-build retains optionality.

TCO drift surfaced: 1.6–2.4x average vs. initial budget
Build-vs-buy flips: ~30% after honest math
Whole-stack vendor risk: high strategic dependency
Common modular split: 70/30 buy/build

Integration depth changes the math more than capability does

A vendor's capability sheet usually wins the slide war. The capability sheet rarely captures integration depth — how the vendor's system actually plugs into the customer's data, identity, observability, and operations stack. A vendor that scores 95 on capability but requires custom integration work on every system it touches is more expensive than a vendor that scores 85 and integrates cleanly.

We score integration depth as a separate axis on the framework. SSO compatibility, audit log integration, data residency options, IDP federation, observability hooks, deployment topology — all of these affect the build cost of buying. A vendor that requires four custom integrations is partially a build project regardless of the buy contract.
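
One way to make the axis concrete is a simple per-vendor checklist score. The criteria below mirror the ones named above; the scoring function and the example vendor are hypothetical sketches, not the framework's actual rubric.

```python
# Hypothetical integration-depth scorecard. The criteria mirror the axes
# named above; the scoring itself is a placeholder, not a published rubric.

INTEGRATION_CRITERIA = [
    "sso_compatibility",
    "audit_log_integration",
    "data_residency_options",
    "idp_federation",
    "observability_hooks",
    "deployment_topology",
]

def integration_score(vendor: dict[str, bool]) -> float:
    """Fraction of criteria the vendor meets out of the box. Anything
    unmet becomes custom build work on the buyer's side."""
    met = sum(vendor.get(c, False) for c in INTEGRATION_CRITERIA)
    return met / len(INTEGRATION_CRITERIA)

# A vendor scoring 95 on capability but meeting 2 of 6 criteria is, in
# effect, a partial build project: four custom integrations land on you.
vendor_a = {"sso_compatibility": True, "audit_log_integration": True}
print(f"Integration depth: {integration_score(vendor_a):.0%}")  # 33%
```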

Sovereignty requirements often decide the answer

When the workload involves data that cannot leave the enterprise perimeter — regulated financial data, healthcare PHI, classified content, M&A-relevant material — the buy options narrow sharply. Most vendor SaaS is multi-tenant and routes through the vendor's infrastructure, which fails the sovereignty test for these workloads. The remaining options are vendor on-premises deployment (often expensive and operationally complex), self-managed open-source equivalents, or building.

Sovereignty is binary on a per-workload basis. A capability that the enterprise can run via SaaS for one workload may need to be self-hosted for another. The build-vs-buy answer can vary across workloads of the same enterprise; a uniform answer is usually a mistake.

Team leverage decides where building actually pays off

The 'we should build it' argument is sometimes valid and sometimes hubris. The honest test: where does your team have leverage that a vendor does not? Domain expertise, proprietary data, integration depth into existing systems, knowledge of internal processes — these are leverage. General-purpose AI infrastructure that twenty vendors build at scale is not leverage.

Building where you have leverage produces a competitive moat. Building where you do not produces an undifferentiated cost center that gets out-competed by vendors continuously investing in the same problem. The framework prompts the question explicitly: what is your team uniquely positioned to build, and what would you be reinventing?

Strategic dependency tolerance is the framework's last factor

Every buy decision creates a strategic dependency. The question is how tolerant the enterprise is of that dependency for that capability. Buying email service from a major vendor is a low-risk dependency. Buying the AI capability that drives 30% of customer interactions is a higher-risk dependency, even if both contracts are similar in size.

Tolerance varies by industry and by leadership posture. Regulated industries usually have lower tolerance; defense and intelligence have very low tolerance; consumer SaaS often has high tolerance. The framework asks the question explicitly so the answer is a deliberate choice rather than an inheritance from the vendor's contract.

What the framework looks like as a deliverable

On a consulting engagement, the build-vs-buy analysis is a written brief: capability decomposed into modules, each module scored on the four factors, total cost of ownership over 36 months for both build and buy paths, sensitivity analysis on the variables that move the decision, and a recommendation with the math behind it. Twenty to thirty pages, signed off in a working session with the executive sponsor and technical leadership.
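
As a sketch of what the sensitivity analysis can look like: the toy model below varies monthly volume, one of the variables that typically moves the decision, and reports where the build path overtakes the buy path over 36 months. All cost parameters here are invented for illustration.

```python
# Toy sensitivity analysis: at what monthly volume does build overtake buy
# over 36 months? All cost parameters are invented for illustration.

def buy_tco(monthly_requests: int, months: int = 36) -> float:
    per_request = 0.03      # vendor unit price (assumed)
    platform_fee = 8_000    # monthly platform fee (assumed)
    return months * (monthly_requests * per_request + platform_fee)

def build_tco(monthly_requests: int, months: int = 36) -> float:
    build_cost = 900_000    # up-front engineering (assumed)
    run_cost = 20_000       # monthly ops and infrastructure (assumed)
    per_request = 0.004     # self-hosted marginal cost (assumed)
    return build_cost + months * (run_cost + monthly_requests * per_request)

for volume in (100_000, 500_000, 1_000_000, 2_000_000):
    cheaper = "build" if build_tco(volume) < buy_tco(volume) else "buy"
    print(f"{volume:>9,} req/mo -> buy ${buy_tco(volume):,.0f}, "
          f"build ${build_tco(volume):,.0f}, cheaper: {cheaper}")
```

Under these assumptions the answer flips between one and two million requests per month; the deliverable runs this kind of sweep on each variable that can move the decision, so the sponsor can see how close the answer sits to a crossover.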

The deliverable is not the recommendation — it is the math. If the recommendation surprises the executive sponsor, the math is the artifact that makes the surprise actionable. If the recommendation confirms the original direction, the math is the artifact that makes the confidence durable.

"We were going to buy the whole platform from one vendor. The framework surfaced that we had genuine leverage on the retrieval layer because of our document corpus, and the vendor's retrieval was generic. We bought the model layer, the observability, the eval framework — and built our own retrieval. Two years in, the retrieval is the part that competitors cannot match. We would have rented the most valuable part of the system if we had not done the analysis."

— Chief Digital Officer, large insurance carrier

Frequently asked questions

Why is the typical build-vs-buy analysis wrong?

Because the cost model usually captures the model API price and forgets the rest: integration engineering, eval maintenance, observability, the ops team, human review queues, retraining cycles, security and compliance reviews, vendor management. When all the lines are accounted for, total cost of ownership commonly runs 1.6 to 2.4x the original budget, which can change the answer. About 30% of the build-vs-buy decisions we audit flip after honest math.

Should I really build it, or just buy from a vendor?

Usually neither alone — most enterprise AI capabilities split 70/30. Buy the foundation (frontier models, vector databases, observability platforms, eval frameworks). Build the parts that depend on your domain, your data, your integration depth. Whole-stack vendors push back on this because it shrinks their footprint, but the framing produces better outcomes and retains optionality. Modular split is the right default.

How does sovereignty affect the build-vs-buy decision?

When the workload involves data that cannot leave the enterprise perimeter — regulated finance, healthcare PHI, classified content — most vendor SaaS fails the sovereignty test. The remaining options are on-premises vendor deployment, self-managed open-source equivalents, or building. Sovereignty is per-workload, so the same enterprise may have different build-vs-buy answers for different workloads. A uniform answer is usually a mistake.

What is the role of integration depth in the framework?

Capability sheets win slide wars but rarely capture integration depth — how the vendor's system actually plugs into your data, identity, observability, and operations stack. We score integration as a separate axis. A vendor that scores 95 on capability but requires four custom integrations is partially a build project regardless of the contract. Integration depth often changes the answer more than raw capability.

When does building actually produce competitive advantage?

When your team has leverage a vendor cannot replicate — domain expertise, proprietary data, integration depth into existing systems, knowledge of internal processes. Building where you have leverage produces a competitive moat. Building where you do not produces an undifferentiated cost center that gets out-competed by vendors investing continuously. The honest question is what your team is uniquely positioned to build versus what would be reinventing infrastructure.

What does the build-vs-buy deliverable look like on a consulting engagement?

A 20–30 page written brief: capability decomposed into modules, each scored on the four factors (integration depth, sovereignty, leverage, strategic dependency tolerance), total cost of ownership over 36 months for both paths, sensitivity analysis on the decision-moving variables, and a recommendation with the math. Delivered with a working session for the executive sponsor and technical leadership. The math is the artifact that makes the recommendation durable, whether it surprises or confirms.