Corporate prompt engineering: what changes when the agent goes to production
A prompt that works in a demo dies in production. Patterns tested in real Gemini Enterprise agents: structure, guardrails, few-shot, uncertainty handling and versioning.
Fabiano Brito
CEO & Founder
"Prompt engineering" became a meme in 2024: "anyone can do it". In corporate production, it is what separates a reliable agent from an embarrassing one. This post presents the patterns we apply in all Autenticare projects.
Corporate prompt structure — the 7 blocks
Every production prompt has 7 blocks, in this order:
1. Who the agent is and what its scope is. Without this, it assumes it is a "general assistant".
2. Tone, values, brand restrictions. This is where "we" becomes voice.
3. What the agent can do and, most importantly, what it cannot do.
4. How to react when it doesn't know. The most underestimated block in corporate prompts.
5. JSON or text structure, mandatory citations, size limits.
6. 2–5 examples of good behavior, including 1 "I don't know" example.
7. Clear list of tools, with when to use each one and the expected schema.
Without any of these blocks, behavior degrades in non-obvious cases.
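One way to make the seven blocks hard to skip is to assemble the prompt from named parts and fail fast when one is empty. The sketch below is a minimal illustration: the field names (`identity`, `persona`, and so on) and the `build_prompt` helper are labels invented for this example, not names taken from any specific project or SDK.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PromptBlocks:
    """The seven blocks, declared in the order they appear in the prompt.

    Field names are hypothetical labels chosen for this sketch.
    """
    identity: str       # who the agent is and what its scope is
    persona: str        # tone, values, brand restrictions
    capabilities: str   # what the agent can and cannot do
    uncertainty: str    # how to react when it doesn't know
    output_format: str  # JSON/text structure, citations, size limits
    few_shot: str       # 2-5 examples, including one "I don't know"
    tools: str          # when to use each tool and the expected schema


def build_prompt(blocks: PromptBlocks) -> str:
    """Join the blocks in declaration order, rejecting empty ones."""
    sections = []
    for f in fields(blocks):
        text = getattr(blocks, f.name).strip()
        if not text:
            raise ValueError(f"missing prompt block: {f.name}")
        sections.append(f"## {f.name.upper()}\n{text}")
    return "\n\n".join(sections)
```

Raising on an empty block turns "we forgot the uncertainty rules" into a build error instead of a production surprise.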
The most underestimated block: uncertainty rules
By default, an LLM sounds confident even when it doesn't know. In production, that is hallucination in disguise. Always include this instruction verbatim:
"If the required information is not in the retrieved context, respond 'I did not find that information in the available base' — do not invent, do not generalize from your own knowledge. If the question is ambiguous, ask for clarification before responding."
When the agent is certain, it responds; when it isn't, it escalates to a human. This drastically reduces hallucination. See also: evaluation of agents in production.
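The respond-or-escalate split can be sketched as a small decision function around the model call. This is an assumption-laden illustration: `next_action`, `Action`, and the idea of matching the fallback sentence in the reply are hypothetical choices for this example, and a real agent would likely use a structured signal rather than string matching.

```python
from enum import Enum
from typing import Sequence

# The fallback sentence the prompt instructs the model to use.
NOT_FOUND_REPLY = "I did not find that information in the available base"


class Action(Enum):
    ANSWER = "answer"
    ESCALATE = "escalate"


def next_action(passages: Sequence[str], model_reply: str) -> Action:
    """Decide whether to send the reply or hand off to a human.

    Hypothetical helper: escalates when retrieval returned nothing usable
    or when the model used the fallback sentence.
    """
    if not any(p.strip() for p in passages):
        return Action.ESCALATE
    if NOT_FOUND_REPLY.lower() in model_reply.lower():
        return Action.ESCALATE
    return Action.ANSWER
```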
Few-shot: how to choose examples
Poorly chosen few-shot examples bias the model and are worse than no examples at all. Criteria:
- Diversity: cover the 3–5 most common patterns, not 5 variations of the same one.
- Edge cases: include 1 example of "I have no information" and 1 of "I need clarification".
- Mirrored format: each example in the exact expected response format.
- Human-curated: never use LLM outputs as few-shot — it becomes a bias echo.
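Some of these criteria can be checked mechanically before a few-shot set reaches the prompt. The sketch below assumes each example carries a `pattern` label and a `human_curated` flag; both fields, the edge-case labels, and the `check_few_shot` helper are hypothetical names for illustration. It does not verify the mirrored-format criterion, which depends on the agent's own output schema.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List

# Hypothetical labels for the two edge cases the criteria require.
REQUIRED_EDGE_CASES = ("no_information", "needs_clarification")


@dataclass(frozen=True)
class FewShotExample:
    question: str
    answer: str
    pattern: str         # e.g. "billing", "no_information"
    human_curated: bool  # False if the answer came from an LLM


def check_few_shot(examples: List[FewShotExample]) -> List[str]:
    """Return a list of problems; an empty list means the set passes."""
    problems = []
    counts = Counter(ex.pattern for ex in examples)
    for edge in REQUIRED_EDGE_CASES:
        if counts[edge] == 0:
            problems.append(f"missing edge case: {edge}")
    for pattern, n in counts.items():
        if n > 1:  # diversity: one example per pattern
            problems.append(f"pattern repeated {n} times: {pattern}")
    for ex in examples:
        if not ex.human_curated:
            problems.append(f"not human-curated: {ex.question!r}")
    return problems
```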
Patterns that work × anti-patterns
| Recommended pattern | Anti-pattern to avoid |
|---|---|
| Positive constraint ("respond in up to 3 paragraphs") | Negative constraint ("don't respond too long") |
| Explicit structure ("Use headings: Summary / Context / Recommendation") | "Be clear and organized" |
| Mandatory citation ([doc:page] at the end of each statement) | "Include sources when possible" |
| Explicit PII masking (CPF → ***.***.***-12) | "Avoid sensitive data" |
| Self-check before responding | Direct response without review |
| Dates in ISO 8601 (2026-04-20) | "This week", "last month" |
| Explicit language ("Brazilian vocabulary, avoid PT-PT") | Let the model choose the variant |
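Two rows of the table lend themselves to a post-generation check in code. The sketch below masks CPF numbers in the format shown in the table and flags statements that lack a trailing citation. It assumes one statement per line and a `[doc:page]` marker with a numeric page, optionally followed by a period; both conventions and the function names are assumptions for this example.

```python
import re

# Formatted CPF: 000.000.000-00. Only the last two digits are kept.
CPF = re.compile(r"\b\d{3}\.\d{3}\.\d{3}-(\d{2})\b")

# Assumed citation shape: [doc:page] at the end, optional final period.
CITATION_AT_END = re.compile(r"\[[^\[\]:]+:\d+\]\.?$")


def mask_cpf(text: str) -> str:
    """Replace every formatted CPF with ***.***.***-NN."""
    return CPF.sub(r"***.***.***-\1", text)


def uncited_statements(answer: str) -> list:
    """Return the non-empty lines that do not end with a citation."""
    lines = [ln.strip() for ln in answer.splitlines() if ln.strip()]
    return [ln for ln in lines if not CITATION_AT_END.search(ln)]
```

For example, `mask_cpf("CPF 123.456.789-12")` returns `"CPF ***.***.***-12"`. Unformatted CPFs (11 bare digits) are not covered by this pattern.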
Versioning: prompt is code
A production prompt is code. Minimum treatment:
- Dedicated git repository, with PR and review.
- Each version has hash + author + date + motivation.
- A/B test before promoting to 100%.
- Automated evaluation against gold set on every PR.
- Rollback in one command.
Without this, "someone changed the prompt" becomes a production nightmare.
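The metadata and the evaluation gate can be sketched in a few lines. This is an illustration under assumptions: the `PromptVersion` record, the exact-match scoring in `gold_set_score`, and the 0.9 default in `can_promote` are all hypothetical, and a real gold-set evaluation would normally use a richer metric than string equality.

```python
import hashlib
from dataclasses import dataclass
from datetime import date
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class PromptVersion:
    text: str
    author: str
    created: date
    motivation: str

    @property
    def content_hash(self) -> str:
        """Short SHA-256 of the prompt text, stable across machines."""
        return hashlib.sha256(self.text.encode("utf-8")).hexdigest()[:12]


def gold_set_score(answer_fn: Callable[[str], str],
                   gold_set: List[Tuple[str, str]]) -> float:
    """Fraction of gold questions answered exactly as expected."""
    if not gold_set:
        raise ValueError("gold set is empty")
    hits = sum(1 for q, expected in gold_set if answer_fn(q) == expected)
    return hits / len(gold_set)


def can_promote(candidate: float, baseline: float,
                min_score: float = 0.9) -> bool:
    """Gate for a PR: meet the floor and do not regress the baseline."""
    return candidate >= min_score and candidate >= baseline
```

Running `gold_set_score` in CI on every PR and blocking the merge when `can_promote` is false is one way to wire the checklist above into the review flow.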
Model: Pro vs Flash in the same agent
Efficient production pattern:
- Gemini 2.5 Flash: classification, routing, short tasks, schema validation.
- Gemini 2.5 Pro: complex reasoning, main generation, heavy multimodal.
Cost drops 60–80% without perceived quality loss — the user gets Flash for the trivial 70% and Pro for the 30% that matters.
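The split can be expressed as a small routing table in front of the model call. A minimal sketch, assuming the task type is already known (for instance from an upstream classifier); the task labels and the `pick_model` helper are hypothetical, and the model identifier strings should be checked against the ids your platform actually exposes.

```python
# Model identifiers written as assumptions for this sketch.
FLASH = "gemini-2.5-flash"
PRO = "gemini-2.5-pro"

# Hypothetical labels for the light tasks listed above.
LIGHT_TASKS = frozenset(
    {"classification", "routing", "short_task", "schema_validation"}
)


def pick_model(task_type: str) -> str:
    """Send light tasks to Flash; everything else, including unknown
    task types, goes to Pro so quality is the default."""
    return FLASH if task_type in LIGHT_TASKS else PRO
```

Defaulting unknown task types to Pro trades some cost for safety: a misrouted trivial request is cheap to absorb, a misrouted complex one is visible to the user.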
Guardrails beyond the prompt
A prompt alone is not enough. Combine it with:
- Input validation: size limits, command sanitization.
- Output filter: regex/classifier for PII, prohibited content.
- Tool authorization: each tool has its own ACL.
- Rate limit: per user and per agent.
- Confidence threshold: below X, escalate to a human.
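Several of these guardrails live in plain application code around the agent. The sketch below illustrates four of them under stated assumptions: the limits, role names, tool names, and the 0.7 threshold are hypothetical values, and stripping non-printable characters is only a very weak form of sanitization. An output filter for PII would sit after the model call and is not shown here.

```python
import time
from collections import defaultdict, deque

MAX_INPUT_CHARS = 4000      # hypothetical size limit
CONFIDENCE_THRESHOLD = 0.7  # the "X" above; hypothetical value

# Hypothetical per-tool ACL: tool name -> roles allowed to call it.
TOOL_ACL = {
    "search_docs": {"analyst", "support"},
    "issue_refund": {"support_lead"},
}


def validate_input(text: str) -> str:
    """Drop non-printable characters and enforce the size limit."""
    cleaned = "".join(
        ch for ch in text if ch.isprintable() or ch in "\n\t"
    ).strip()
    if not cleaned:
        raise ValueError("empty input")
    if len(cleaned) > MAX_INPUT_CHARS:
        raise ValueError("input too long")
    return cleaned


def can_call_tool(role: str, tool: str) -> bool:
    """Unknown tools are denied by default."""
    return role in TOOL_ACL.get(tool, set())


class RateLimiter:
    """Sliding-window limiter keyed by user id or agent id."""

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window = window_seconds
        self._calls = defaultdict(deque)

    def allow(self, key: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        calls = self._calls[key]
        while calls and now - calls[0] >= self.window:
            calls.popleft()  # forget calls outside the window
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True


def should_escalate(confidence: float) -> bool:
    """Below the threshold, hand the conversation to a human."""
    return confidence < CONFIDENCE_THRESHOLD
```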
Is your production agent running a versioned prompt?
We audit the current prompt, restructure it into 7 blocks, add guardrails and configure the gold set. Delivery in 2 weeks.
