Prompt Engineering Is Software Engineering Now
Version control your prompts. Test them. Review them in PRs. We treat prompts like code — because they are.
A year ago, prompt engineering was copy-pasting from Twitter threads. Now it's the most impactful part of our LLM applications — and we treat it with the same rigor as code.
How we manage prompts
- Version controlled. Every prompt lives in its own file, in Git, with a changelog. No prompts hardcoded as strings in application code.
- Tested. We have a test suite of 50+ input/output pairs for each critical prompt. When we change a prompt, we run the suite.
- Reviewed. Prompt changes go through PR review, just like code changes. A teammate reviews the intent, structure, and test results.
- Documented. Each prompt has a README explaining what it does, what edge cases it handles, and what it's known to fail on.
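As a rough illustration of the workflow above, here is a minimal sketch of loading a prompt from a file and running a case suite against it. The names `load_prompt` and `run_suite`, the `$placeholder` template syntax, and the `must_contain` case field are all assumptions made for this example, not a description of our actual tooling. The `model` argument stands in for whatever client function calls your LLM.

```python
from pathlib import Path
from string import Template
from typing import Callable


def load_prompt(path: Path) -> Template:
    """Read a prompt template from a file tracked in Git."""
    return Template(path.read_text(encoding="utf-8"))


def run_suite(
    prompt: Template,
    cases: list[dict],
    model: Callable[[str], str],
) -> list[str]:
    """Run every case and return the names of the ones that failed."""
    failures = []
    for case in cases:
        # Fill the template placeholders with this case's input values.
        output = model(prompt.substitute(case["input"]))
        # Assert a property of the output, not exact equality.
        if case["must_contain"].lower() not in output.lower():
            failures.append(case["name"])
    return failures
```

Running a suite like this in CI on every prompt change is one way to attach test results to a PR, so the reviewer sees which cases regressed.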
Prompt architecture patterns
- System → Context → Task → Format. This four-part structure works for 90% of our use cases. System sets the role, context provides the background, task defines the action, format specifies the output.
- Few-shot over zero-shot. 3-5 examples in the prompt consistently outperform instructions alone. Pick examples that cover edge cases, not just the happy path.
- Chain of thought for complex tasks. Ask the model to reason step by step on multi-part tasks. Skip it for simple extraction, where it adds latency without improving the output.
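The four-part structure and few-shot examples can be sketched as a small data class. This is a hypothetical helper: the `PromptSpec` name, the `## Section` headings, and the placement of the examples between context and task are assumptions chosen for illustration, and other layouts may work equally well.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    system: str          # role the model should take
    context: str         # background the model needs
    task: str            # the action to perform
    output_format: str   # what the output should look like
    examples: list[tuple[str, str]] = field(default_factory=list)

    def render(self, user_input: str) -> str:
        """Assemble the sections in System, Context, Task, Format order."""
        shots = "\n\n".join(
            f"Input: {inp}\nOutput: {out}" for inp, out in self.examples
        )
        sections = [
            ("System", self.system),
            ("Context", self.context),
            ("Examples", shots),
            ("Task", self.task),
            ("Format", self.output_format),
            ("Input", user_input),
        ]
        # Empty sections (for example, no few-shot examples) are skipped.
        return "\n\n".join(f"## {name}\n{body}" for name, body in sections if body)


spec = PromptSpec(
    system="You are a support-ticket classifier.",
    context="Tickets come from a billing product.",
    task="Classify the ticket as 'refund', 'bug', or 'other'.",
    output_format='Return JSON: {"label": "<category>"}',
    examples=[
        ("I was charged twice.", '{"label": "refund"}'),
        ("asdf ???", '{"label": "other"}'),  # an edge case, not just the happy path
    ],
)
rendered = spec.render("The invoice page crashes.")
```

Keeping the parts as separate fields makes a prompt diff in a PR easier to read, since a reviewer can see which of the four parts changed.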
The testing problem
LLM outputs are non-deterministic, so exact-match assertions like assertEqual break on harmless wording changes. Instead, we check properties of the output:
- Schema validation — did the output match the expected JSON structure?
- Keyword checks — does the output contain required fields or phrases?
- LLM-as-judge — use a second model to evaluate whether the output is correct (surprisingly effective for subjective quality)
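The three checks above might look roughly like the following. This is a sketch under stated assumptions: the function names, the PASS/FAIL judging protocol, and the `judge_model` callable (a stand-in for a call to a second model) are all hypothetical, and the schema check only covers top-level keys and types rather than a full JSON Schema.

```python
import json
from typing import Callable


def check_schema(raw: str, required: dict[str, type]) -> bool:
    """True if raw is a JSON object with every required key of the right type."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    return all(isinstance(data.get(key), typ) for key, typ in required.items())


def check_keywords(raw: str, keywords: list[str]) -> bool:
    """True if every keyword appears in the output, ignoring case."""
    lowered = raw.lower()
    return all(keyword.lower() in lowered for keyword in keywords)


def judge(output: str, rubric: str, judge_model: Callable[[str], str]) -> bool:
    """Ask a second model for a PASS/FAIL verdict against a rubric."""
    prompt = (
        f"Rubric: {rubric}\n\n"
        f"Candidate output:\n{output}\n\n"
        "Answer with exactly PASS or FAIL."
    )
    return judge_model(prompt).strip().upper().startswith("PASS")
```

Schema and keyword checks are cheap and deterministic, so it usually makes sense to run them first and reserve the judge call for outputs that already pass them.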
Written by the Xceed AI team.
