Email triage with an agent: how an insurer processes 8,000 messages/day
The claims mailbox received 8,000 emails/day and was missing its SLA. A Gemini Enterprise agent now classifies each email, extracts the data, opens the claim in the core system, and replies, all in 90 seconds per email.
Fabiano Brito
CEO & Founder
Email triage with an AI agent is the automated classification, routing, and data extraction of high-volume customer emails and heterogeneous attachments. For enterprises, deploying this technology can slash classification times from 18 hours to 90 seconds and boost SLA compliance from 34% to 97% while reallocating staff to fraud analysis.
[Figure: claims@ inbox, promised SLA of 4h, before vs. after]
Email triage is the most underestimated use case in enterprise AI. It sounds simple — "read the email and classify it". At real scale, with 6 dialects of business rules, 12 insurance lines and heterogeneous attachments, this is where 80% of pilots die from underestimating complexity.
This case shows what works when the problem gets the respect it deserves.
Starting point
- The central sinistros@ mailbox receives 8,000 emails/day.
- 20 human triagers classified, routed, or requested complements.
- Average classification time: 18 hours (promised SLA: 4h).
- Attachments: PDFs (police report, medical report, photos), images, forwarded WhatsApp audio.
- Chronic Monday backlog (~14k messages).
- Recurring Procon complaints about delays.
What the agent does
- Receives the email via Gmail API + webhook.
- Reads body + all attachments (Gemini 2.5 Pro multimodal: PDF, image, audio).
- Identifies the line (auto, home, life, health, liability, equipment).
- Identifies the policyholder: cross-references mentioned CPF/policy with the core.
- Classifies the type: new claim notice, complement to existing claim, question, complaint, spam.
- Extracts structured data: event date, location, description, estimated value, attached documents, witness data.
- Decides the next action:
  - New claim with complete data → opens claim in core.
  - New claim with missing data → replies with checklist of what is missing.
  - Complement → links to existing claim.
  - Question → routes to customer service with context.
  - Complaint → escalates to ombudsman with severity classification.
- Replies to the policyholder in clear PT-BR with protocol number.
- Logs everything: original email, classification, actions, sent reply.
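The "decides the next action" step above can be sketched as a small routing function. This is a minimal illustration, not the insurer's implementation: the `REQUIRED_FIELDS` tuple, the action names, and the handling of spam are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class EmailType(Enum):
    NEW_CLAIM = "new_claim"
    COMPLEMENT = "complement"
    QUESTION = "question"
    COMPLAINT = "complaint"
    SPAM = "spam"


# Hypothetical: fields a new claim needs before it can be opened in the core.
REQUIRED_FIELDS = ("event_date", "location", "description")


@dataclass
class TriageResult:
    email_type: EmailType
    extracted: dict = field(default_factory=dict)


def missing_fields(result: TriageResult) -> list[str]:
    """Return the required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not result.extracted.get(name)]


def next_action(result: TriageResult) -> str:
    """Map a classified email to one of the actions listed in the article."""
    if result.email_type is EmailType.NEW_CLAIM:
        if missing_fields(result):
            return "reply_with_checklist"
        return "open_claim"
    if result.email_type is EmailType.COMPLEMENT:
        return "link_to_existing_claim"
    if result.email_type is EmailType.QUESTION:
        return "route_to_customer_service"
    if result.email_type is EmailType.COMPLAINT:
        return "escalate_to_ombudsman"
    # Assumption: the article does not say what happens to spam.
    return "discard"
```

Keeping the decision in plain code rather than in the prompt makes each branch easy to test and to audit against the log.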
Architecture
- Gemini Enterprise Plus: orchestration + model access.
- Gemini 2.5 Pro: multimodal reading (email + attachments).
- Gemini 2.5 Flash: fast classification and routing (cost fallback).
- Vertex AI Search: knowledge base (underwriting policies, product manuals, anonymized prior decisions).
- Tools:
  - Policyholder lookup (CPF/policy → data).
  - Claim opening in core (SOAP via Apigee).
  - Complement linking.
  - Reply via Gmail.
  - Routing to ombudsman/customer service.
- Cloud Run: webhook + retry logic.
- BigQuery: structured log for analysis.
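The article does not describe the retry logic that runs on Cloud Run. One common shape for it is exponential backoff with jitter around calls to downstream systems. The sketch below assumes a hypothetical `TransientError` exception and illustrative default delays. It does not reflect the project's actual settings.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx)."""


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on TransientError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller handle the failure
            # Delay doubles each attempt (0.5s, 1s, 2s, ...) plus up to 10% jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("unreachable")
```

The `sleep` parameter is injectable so the function can be tested without real waiting.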
Results in 60 days
| Metric | Before | After | Delta |
|---|---|---|---|
| Average classification time | 18 h | 90 s | −99.9% |
| Volume processed | 8,000/day | 8,000/day | = |
| Automatic claim opening | 0% | 73% | +73 pts |
| Cases still requiring human | 100% | 27% | −73 pts |
| SLA met | 34% | 97% | +63 pts |
| Procon complaints (3 months) | 142 | 38 | −73% |
| Human triagers | 20 | 8 | 12 reallocated |
The 12 reallocated triagers moved to complex claim analysis (potential fraud, multi-victim cases), an area previously outsourced. The full project paid for itself in 4 months.
What worked — and why
1. True multimodal
OCR alone was not enough. A smudged police report in a cell phone photo, a witness's audio forwarded from WhatsApp: Gemini 2.5 Pro reads all of it directly, without a separate transcription pipeline.
2. Human-sounding reply
We invested in prompt design to avoid a robotic tone. Each reply addresses the policyholder by name, paraphrases the event (showing that it was understood), lists what is needed, and gives the protocol number and deadline. Policyholder NPS rose 24 pts.
3. Classification with numeric confidence
The agent returns a confidence score (0–1) per category. Anything below 0.85 goes to a human. The threshold was calibrated on 500 real cases, and it drastically reduces automatic classification errors.
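A minimal sketch of this confidence gate, assuming the per-category scores arrive as a plain dictionary. The function name and the return format are illustrative. Only the 0.85 threshold comes from the case.

```python
from __future__ import annotations

CONFIDENCE_THRESHOLD = 0.85  # value stated in the article


def route_by_confidence(
    scores: dict[str, float],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> tuple[str, str]:
    """Pick the top category and decide whether a human must review it.

    Returns ("auto", category) or ("human", category).
    """
    if not scores:
        raise ValueError("scores must contain at least one category")
    category, confidence = max(scores.items(), key=lambda item: item[1])
    # Strictly below the threshold goes to a human, as described above.
    if confidence < threshold:
        return ("human", category)
    return ("auto", category)
```

For example, `route_by_confidence({"new_claim": 0.93, "question": 0.05})` keeps the email on the automatic path, while a top score of 0.60 sends it to a triager along with the agent's best guess.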
4. Closed learning loop
Each case the human corrects becomes an example in the gold set. We re-evaluated weekly for the first 8 weeks. Per-category recall rose 11 pts in the period.
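Per-category recall on a gold set is the share of emails that truly belong to a category and that the agent also assigned to it. The sketch below shows one way to compute it, assuming gold labels and agent predictions are stored as parallel lists. That storage format is an assumption.

```python
from __future__ import annotations

from collections import Counter


def per_category_recall(gold: list[str], predicted: list[str]) -> dict[str, float]:
    """Recall per category: correct predictions / gold examples of that category."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    totals = Counter(gold)
    hits = Counter(g for g, p in zip(gold, predicted) if g == p)
    return {category: hits[category] / totals[category] for category in totals}
```

Running this weekly on the same gold set, extended with each human correction, gives a per-category trend rather than a single accuracy figure that can hide a weak category.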
What went wrong — and how we worked around it
Huge attachments
A 200-page PDF (medical report) exceeded the token limit. Solution: pre-summarize it in chunks with Gemini Flash before sending it to the main agent.
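The chunked pre-summarization can be sketched as follows. The `summarize` callable stands in for the call to the Flash model and is hypothetical, as is the default of 20 pages per chunk. The article gives neither the chunk size nor the prompt.

```python
from __future__ import annotations

from typing import Callable


def chunk_pages(pages: list[str], pages_per_chunk: int = 20) -> list[list[str]]:
    """Split a document's pages into consecutive fixed-size chunks."""
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    return [pages[i:i + pages_per_chunk] for i in range(0, len(pages), pages_per_chunk)]


def presummarize(
    pages: list[str],
    summarize: Callable[[str], str],
    pages_per_chunk: int = 20,
) -> str:
    """Summarize each chunk separately, then join the partial summaries.

    The joined text is what would be handed to the main agent in place of
    the full document.
    """
    partials = [
        summarize("\n".join(chunk))
        for chunk in chunk_pages(pages, pages_per_chunk)
    ]
    return "\n".join(partials)
```

With these defaults a 200-page report becomes 10 summarization calls, each well inside the context window, and the main agent sees only the 10 partial summaries.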
Sophisticated spam
Fake insurance billing emails landed in the inbox. We trained a dedicated spam/phishing classifier as the first pipeline stage.
Heavy-accent audio
Gemini 2.5 Pro has improved a lot on regional PT-BR accents but still makes mistakes. When transcription confidence drops, the agent politely asks the policyholder to write or call.
Email chains
Threads with 15 replies produced confusing context. We added thread summarization as a pre-processing step.
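The article does not detail how threads are summarized. One inexpensive step that often comes before any summarization is removing quoted history, so that each reply contributes only its new text. The sketch below assumes plain-text bodies, `>`-prefixed quotes, and reply headers of the form "On ... wrote:" or "Em ... escreveu:". Real mail clients vary, so treat the patterns as illustrative.

```python
from __future__ import annotations

import re

# Assumed reply-header formats (English and PT-BR); real clients differ.
QUOTE_HEADER = re.compile(r"^(On .+ wrote:|Em .+ escreveu:)\s*$")


def strip_quoted(body: str) -> str:
    """Keep only the text the sender wrote in this message."""
    kept = []
    for line in body.splitlines():
        if QUOTE_HEADER.match(line):
            break  # everything after the header is quoted history
        if line.lstrip().startswith(">"):
            continue  # inline quoted line
        kept.append(line)
    return "\n".join(kept).strip()


def condense_thread(bodies: list[str]) -> str:
    """Number each message (oldest first) after removing quoted text."""
    return "\n".join(
        f"[{i}] {strip_quoted(body)}" for i, body in enumerate(bodies, start=1)
    )
```

The condensed thread can then be passed to a summarization prompt, with the numbering preserving the order of replies.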
The NPS gain from policyholders (+24 pts) surprised us more than the ROI: a human-sounding reply, immediate protocol and a clear deadline beat 18 hours of silence.
Replicability
The pattern works for any high-volume mailbox with complex rules: claims, industrial customer service, HR (onboarding), legal (subpoenas), bank back office. We detail the financial vertical in Gemini Enterprise for financial services.
Frequently Asked Questions about Email triage with an agent: how an insurer processes 8,000 messages/day
How many emails did the insurer process daily in the sinistros@ inbox? The insurer processed 8,000 emails per day in the sinistros@ inbox.
What was the average email classification time before the agent implementation? The average email classification time was 18 hours before the agent implementation.
What was the impact of the agent implementation on SLA compliance for classification? SLA compliance for classification increased from 34% to 97% after the agent implementation.
What percentage of claims were opened automatically after the agent implementation? Automatic claim opening reached 73% after the agent implementation.
Does your inbox receive > 1,000 emails/day with complex rules?
30-minute diagnostic: volume, current SLA, attachments, line rules. You leave with an ROI estimate and a 60-day plan. The pattern is replicable for claims, industrial customer service, HR, and legal.
