Email triage with an agent: how an insurer processes 8,000 messages/day
The claims mailbox received 8,000 emails/day and routinely missed its SLA. A Gemini Enterprise agent now classifies each email, extracts the data, opens the claim in the core system and replies, all in 90 seconds per email.
Fabiano Brito
CEO & Founder
Email triage is the most underestimated use case in enterprise AI. It sounds simple — "read the email and classify it". At real scale, with 6 dialects of business rules, 12 insurance lines and heterogeneous attachments, this is where 80% of pilots die from underestimating complexity.
This case shows what works when the problem gets the respect it deserves.
Starting point
- Central sinistros@ mailbox receives 8,000 emails/day.
- 20 human triagers classified, routed or requested complements.
- Average classification time: 18 hours (promised SLA: 4h).
- Attachments: PDFs (police report, medical report, photos), images, forwarded WhatsApp audio.
- Chronic Monday backlog (~14k messages).
- Recurring complaints about delays filed with Procon (the consumer protection agency).
What the agent does
- Receives the email via Gmail API + webhook.
- Reads body + all attachments (Gemini 2.5 Pro multimodal: PDF, image, audio).
- Identifies the line (auto, home, life, health, liability, equipment).
- Identifies the policyholder: cross-references mentioned CPF/policy with the core.
- Classifies the type: new claim notice, complement to existing claim, question, complaint, spam.
- Extracts structured data: event date, location, description, estimated value, attached documents, witness data.
- Decides the next action:
  - New claim with complete data → opens claim in core.
  - New claim with missing data → replies with checklist of what is missing.
  - Complement → links to existing claim.
  - Question → routes to customer service with context.
  - Complaint → escalates to ombudsman with severity classification.
- Replies to the policyholder in clear PT-BR with protocol number.
- Logs everything: original email, classification, actions, sent reply.
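The "decides the next action" step above can be sketched as a small routing function. This is a minimal illustration, not the insurer's implementation: the `REQUIRED_FIELDS` checklist, the action names and the handling of spam are assumptions made for the example, since the article does not specify them.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class EmailType(Enum):
    NEW_CLAIM = "new_claim"
    COMPLEMENT = "complement"
    QUESTION = "question"
    COMPLAINT = "complaint"
    SPAM = "spam"


# Hypothetical checklist: the real required fields depend on the
# insurance line and are not given in the article.
REQUIRED_FIELDS = ("event_date", "location", "description")


@dataclass
class TriageResult:
    email_type: EmailType
    extracted: dict = field(default_factory=dict)


def missing_fields(extracted: dict) -> list[str]:
    """Return the required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not extracted.get(name)]


def next_action(result: TriageResult) -> tuple[str, list[str]]:
    """Map a triage result to an action name plus any missing fields."""
    if result.email_type is EmailType.NEW_CLAIM:
        missing = missing_fields(result.extracted)
        if missing:
            return "reply_with_checklist", missing
        return "open_claim", []
    if result.email_type is EmailType.COMPLEMENT:
        return "link_to_existing_claim", []
    if result.email_type is EmailType.QUESTION:
        return "route_to_customer_service", []
    if result.email_type is EmailType.COMPLAINT:
        return "escalate_to_ombudsman", []
    # Spam: the article does not say what happens; discarding is an assumption.
    return "discard", []
```

Keeping the decision as plain code, separate from the model call, makes each branch easy to log and to test against the gold set.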
Architecture
- Gemini Enterprise Plus: orchestration + model access.
- Gemini 2.5 Pro: multimodal reading (email + attachments).
- Gemini 2.5 Flash: fast classification and routing (cost fallback).
- Vertex AI Search: knowledge base (underwriting policies, product manuals, anonymized prior decisions).
- Tools:
  - Policyholder lookup (CPF/policy → data).
  - Claim opening in core (SOAP via Apigee).
  - Complement linking.
  - Reply via Gmail.
  - Routing to ombudsman/customer service.
- Cloud Run: webhook + retry logic.
- BigQuery: structured log for analysis.
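The retry logic mentioned for the webhook is not described in detail. One common approach is exponential backoff around calls to downstream tools such as the core system. The sketch below assumes that approach; `call_with_retry` and its parameters are hypothetical names, and the delays are illustrative.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with exponential backoff.

    Waits base_delay_s, then 2x, 4x, ... between attempts and re-raises
    the last exception once max_attempts is exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function testable without real waiting. In production one would likely retry only on transient errors (timeouts, 5xx responses) rather than on every exception.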
Results in 60 days
| Metric | Before | After | Delta |
|---|---|---|---|
| Average classification time | 18 h | 90 s | −99.9% |
| Volume processed | 8,000/day | 8,000/day | = |
| Automatic claim opening | 0% | 73% | +73 pts |
| Cases still requiring human | 100% | 27% | −73 pts |
| SLA met | 34% | 97% | +63 pts |
| Procon complaints (3 months) | 142 | 38 | −73% |
| Human triagers | 20 | 8 | 12 reallocated |
The 12 reallocated triagers moved to complex claim analysis (potential fraud, multi-victim cases), an area previously outsourced. The full project paid for itself in 4 months.
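For readers who want to check the deltas, the arithmetic is short: durations and counts are compared as relative change, while rates are compared in percentage points. The helper names below are illustrative.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


def point_change(before_pct: float, after_pct: float) -> float:
    """Difference between two rates, in percentage points."""
    return after_pct - before_pct


# Classification time: 18 hours versus 90 seconds, both in seconds.
classification = pct_change(18 * 3600, 90)  # about -99.86
# Procon complaints over 3 months: 142 versus 38.
complaints = pct_change(142, 38)            # about -73.2
# SLA met: 34% versus 97%.
sla = point_change(34, 97)                  # 63 points
```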
What worked — and why
1. True multimodal
OCR was not enough. A smudged police report in a cell phone photo, witness audio forwarded from WhatsApp: Gemini 2.5 Pro reads it all directly, without a separate transcription pipeline.
2. Human-sounding reply
We invested in prompt design to avoid a robotic tone. Each reply names the policyholder, paraphrases the event (to show it was understood), lists what is needed, and gives the protocol number and deadline. Policyholder NPS rose 24 pts.
3. Classification with numeric confidence
The agent returns a confidence score (0–1) per category. Anything below 0.85 goes to a human. The threshold was calibrated with 500 real cases and drastically reduces automatic classification errors.
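A minimal sketch of the confidence gate, assuming the agent's output arrives as a dictionary of category scores. The 0.85 threshold comes from the text; the function names and the shape of the calibration data are assumptions.

```python
from __future__ import annotations

HUMAN_REVIEW_THRESHOLD = 0.85  # value stated in the article


def route_by_confidence(
    scores: dict[str, float],
    threshold: float = HUMAN_REVIEW_THRESHOLD,
) -> tuple[str, str, float]:
    """Pick the top category and decide whether a human must review it.

    Returns (destination, category, confidence), where destination is
    "auto" when confidence >= threshold and "human" otherwise.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    category = max(scores, key=scores.get)
    confidence = scores[category]
    destination = "auto" if confidence >= threshold else "human"
    return destination, category, confidence


def auto_error_rate(cases: list[tuple[float, bool]], threshold: float) -> float:
    """Error rate among cases that would be handled automatically.

    cases holds (confidence, was_correct) pairs from labeled emails;
    evaluating this over candidate thresholds is one way to calibrate.
    """
    auto = [correct for confidence, correct in cases if confidence >= threshold]
    if not auto:
        return 0.0
    return auto.count(False) / len(auto)
```

Running `auto_error_rate` over a labeled sample at several thresholds shows the trade-off directly: a higher threshold lowers the automatic error rate but sends more emails to humans.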
4. Closed learning loop
Each case the human corrects becomes an example in the gold set. We re-evaluated weekly for the first 8 weeks. Per-category recall rose 11 pts in the period.
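Per-category recall over a gold set is simple to compute. The sketch below assumes each gold example is stored as a pair of the human-confirmed label and the agent's prediction; that storage format is an assumption, not something the article describes.

```python
from __future__ import annotations

from collections import Counter


def per_category_recall(gold: list[tuple[str, str]]) -> dict[str, float]:
    """Compute recall for each true category.

    gold holds (true_label, predicted_label) pairs. Recall for a category
    is the share of its true examples that the agent labeled correctly.
    """
    totals: Counter = Counter()
    hits: Counter = Counter()
    for true_label, predicted_label in gold:
        totals[true_label] += 1
        if predicted_label == true_label:
            hits[true_label] += 1
    return {label: hits[label] / totals[label] for label in totals}
```

Re-running this weekly on the growing gold set gives a per-category trend, which makes it visible when a prompt change helps one category while hurting another.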
What went wrong — and how we worked around it
Huge attachments
A 200-page PDF (a medical report) blew the token limit. Solution: pre-summarize it in chunks with Gemini Flash before sending it to the main agent.
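The chunked pre-summarization can be sketched as follows. The model call is injected as a plain `summarize` function so the example stays independent of any SDK; the chunk size of 20 pages and the page-range tags are assumptions for illustration.

```python
from __future__ import annotations

from typing import Callable


def chunk_pages(pages: list[str], pages_per_chunk: int = 20) -> list[str]:
    """Group page texts into chunks of at most pages_per_chunk pages."""
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    return [
        "\n".join(pages[start:start + pages_per_chunk])
        for start in range(0, len(pages), pages_per_chunk)
    ]


def presummarize(
    pages: list[str],
    summarize: Callable[[str], str],
    pages_per_chunk: int = 20,
) -> str:
    """Summarize each chunk and tag it with its page range.

    summarize stands in for a call to a fast model; the page tags let
    the main agent (or a human) trace a fact back to the source pages.
    """
    sections = []
    for index, chunk in enumerate(chunk_pages(pages, pages_per_chunk)):
        first = index * pages_per_chunk + 1
        last = min(first + pages_per_chunk - 1, len(pages))
        sections.append(f"[pages {first}-{last}] {summarize(chunk)}")
    return "\n".join(sections)
```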
Sophisticated spam
Fake insurance bills (phishing) landed in the inbox. We trained a dedicated spam/phishing classifier and placed it as the first pipeline stage.
Heavy-accent audio
Gemini 2.5 Pro has improved a lot on regional PT-BR accents but still makes mistakes. When transcription confidence drops, the agent politely asks the policyholder to write or call.
Email chains
Threads with 15 replies produced confusing context. We added thread summarization as a pre-processing step.
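One way to structure that pre-processing, sketched under assumptions: keep the newest message verbatim and compress everything older into a single summary. The `summarize` callable stands in for the model call, and `keep_last` and the separator are illustrative choices.

```python
from __future__ import annotations

from typing import Callable


def build_thread_context(
    messages: list[str],
    summarize: Callable[[str], str],
    keep_last: int = 1,
) -> str:
    """Build compact context from a thread ordered oldest to newest.

    The last keep_last messages are kept verbatim; older messages are
    replaced by one summary so the agent sees the latest request clearly.
    """
    if keep_last < 1:
        raise ValueError("keep_last must be at least 1")
    older, recent = messages[:-keep_last], messages[-keep_last:]
    parts = []
    if older:
        summary = summarize("\n---\n".join(older))
        parts.append("Summary of earlier messages: " + summary)
    parts.extend(recent)
    return "\n\n".join(parts)
```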
The NPS gain from policyholders (+24 pts) surprised us more than the ROI: a human-sounding reply, immediate protocol and a clear deadline beat 18 hours of silence.
Replicability
The pattern works for any high-volume mailbox with complex rules: claims, industrial customer service, HR (onboarding), legal (subpoenas), bank back office. We detail the financial vertical in Gemini Enterprise for financial services.
Does your inbox receive more than 1,000 emails/day with complex rules?
30-minute diagnostic covering volume, current SLA, attachments and line rules. You leave with an ROI estimate and a 60-day plan.
