The Build-vs-Buy Framework for AI Features

Every AI feature decision starts with "should we build this or buy it?" The answer is almost never purely one or the other.

The three options

API call (buy): Use Claude, GPT, or a specialized AI API directly. Fastest to ship, lowest control.
Fine-tune (customize): Take a base model and train it on your data. Middle ground on speed and control.
Build from scratch (own): Train a custom model on your data for your specific task. Slowest, most control.

The decision tree

Is the task generic? (summarization, translation, general Q&A) → Use an API. Don't fine-tune for tasks that frontier models already do well.

Do you need specific behavior? (always output a certain format, use domain terminology, follow strict rules) → Try prompting first. If prompting isn't reliable enough, fine-tune.

Is latency critical? (real-time, <100ms) → Consider a small, specialized model. A round trip to a hosted API often adds 200-500ms or more once network and inference time are counted, which already exceeds that budget.

Is data privacy non-negotiable? (healthcare, defense, financial PII) → Self-hosted model or on-premise deployment. No API calls to external services.

Do you have 10,000+ labeled examples? → Fine-tuning becomes viable. Below that, RAG + prompting usually wins.
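The five questions above can be encoded as a simple checklist. The sketch below is one hypothetical way to do that: the `FeatureProfile` fields and the `recommend` function are illustrative names, not part of any library, and the only thresholds used are the ones stated above (100ms and 10,000 examples). Each question is checked independently and in the same order, so the output is a list of notes rather than a single verdict.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FeatureProfile:
    """Hypothetical answers to the five questions in the decision tree."""
    is_generic_task: bool
    needs_specific_behavior: bool
    prompting_is_reliable: bool
    latency_budget_ms: float
    privacy_non_negotiable: bool
    labeled_examples: int


def recommend(p: FeatureProfile) -> List[str]:
    """Return one note per question that applies, in document order."""
    notes = []
    if p.is_generic_task:
        notes.append("use an API")
    if p.needs_specific_behavior:
        # Prompting comes first; fine-tuning is the fallback.
        if p.prompting_is_reliable:
            notes.append("prompting is enough")
        else:
            notes.append("fine-tune candidate")
    if p.latency_budget_ms < 100:
        notes.append("consider a small, specialized model")
    if p.privacy_non_negotiable:
        notes.append("self-host or deploy on-premise")
    if p.labeled_examples >= 10_000:
        notes.append("fine-tuning is viable")
    else:
        notes.append("RAG + prompting usually wins")
    return notes


if __name__ == "__main__":
    profile = FeatureProfile(
        is_generic_task=False,
        needs_specific_behavior=True,
        prompting_is_reliable=False,
        latency_budget_ms=800,
        privacy_non_negotiable=False,
        labeled_examples=1_200,
    )
    for note in recommend(profile):
        print(note)
```

When the notes pull in different directions (for example, "fine-tune candidate" alongside "RAG + prompting usually wins"), that is a signal to gather more labeled data or invest more in prompting before committing to a training run.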

The cost reality

API: $0.01-0.10 per request. Cost scales linearly with volume, so it is the cheapest option at low volume and gets expensive at high volume.
Fine-tune: $500-5,000 per training run. Cheaper per request once deployed, but evaluation is an ongoing cost.
Custom model: $50,000+ in engineering time. Only makes sense at very high volume or with unique requirements.
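To see where "low volume" ends, it helps to compute a break-even point. The sketch below is a rough calculation under stated assumptions: the per-request cost of a deployed fine-tuned model is not given above, so `tuned_cost_per_request` is a hypothetical input you would need to measure, and ongoing evaluation costs are ignored. The function name is illustrative.

```python
import math


def break_even_requests(
    upfront_cost: float,
    api_cost_per_request: float,
    tuned_cost_per_request: float,
) -> float:
    """Requests needed before the upfront spend is repaid by per-request savings.

    Returns math.inf when the alternative is not cheaper per request,
    because the upfront cost is then never recovered.
    """
    savings = api_cost_per_request - tuned_cost_per_request
    if savings <= 0:
        return math.inf
    return upfront_cost / savings


if __name__ == "__main__":
    # $2,000 training run, $0.05 per API request (both within the ranges above),
    # and an ASSUMED $0.01 per request for the fine-tuned model.
    n = break_even_requests(2_000, 0.05, 0.01)
    print(f"Break-even after roughly {n:,.0f} requests")
```

The same function works for the custom-model case by passing the $50,000+ engineering cost as `upfront_cost`, which shows why that option only pays off at very high volume.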

Our rule of thumb

Start with the API. Only move to fine-tuning when you can prove the API isn't meeting your accuracy or latency requirements. Only build custom when you have the data, the team, and the scale to justify it.
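"Proving" the API falls short means measuring it against explicit targets. A minimal sketch, assuming you have already run the API over a labeled evaluation set and recorded, for each request, whether the answer was correct and how long it took. The function name, the input shape, and the choice of 95th-percentile latency as the metric are all assumptions made for illustration.

```python
from typing import List, Tuple


def api_meets_requirements(
    results: List[Tuple[bool, float]],
    min_accuracy: float,
    max_p95_latency_ms: float,
) -> bool:
    """Check measured (is_correct, latency_ms) pairs against both targets."""
    if not results:
        raise ValueError("need at least one measurement")

    accuracy = sum(1 for ok, _ in results if ok) / len(results)

    # Nearest-rank 95th percentile, computed with integer math
    # to avoid floating-point rounding in the rank.
    latencies = sorted(ms for _, ms in results)
    rank = (95 * len(latencies) + 99) // 100
    p95 = latencies[rank - 1]

    return accuracy >= min_accuracy and p95 <= max_p95_latency_ms


if __name__ == "__main__":
    # Nine correct answers at 200ms, one wrong answer at 900ms.
    measured = [(True, 200.0)] * 9 + [(False, 900.0)]
    if api_meets_requirements(measured, min_accuracy=0.85, max_p95_latency_ms=1000):
        print("Stay on the API")
    else:
        print("Consider fine-tuning or a smaller model")
```

Only a `False` result on a representative evaluation set is a reason to move on from the API.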