Email triage with an agent: how an insurer processes 8,000 messages/day

Email triage with an AI agent is the automated classification, routing, and data extraction of high-volume customer emails and heterogeneous attachments. For enterprises, deploying this technology can slash classification times from 18 hours to 90 seconds and boost SLA compliance from 34% to 97% while reallocating staff to fraud analysis.

TL;DR: A mid-size Brazilian insurer (~1.5M policyholders) replaced manual claims triage with a Gemini Enterprise agent. In 60 days: classification time fell from 18h to 90s (promised SLA: 4h), 73% of claims were opened automatically, Procon complaints dropped from 142 to 38, and 12 triagers were reallocated to fraud analysis.

8,000 emails/day in the sinistros@ (claims) inbox
18h → 90s classification time (promised SLA: 4h)
34% → 97% SLA met (before vs after)

Email triage is the most underestimated use case in enterprise AI. It sounds simple — "read the email and classify it". At real scale, with 6 dialects of business rules, 12 insurance lines and heterogeneous attachments, this is where 80% of pilots die from underestimating complexity.

This case shows what works when the problem gets the respect it deserves.

Starting point

The central sinistros@ (claims) mailbox received 8,000 emails/day.
20 human triagers classified each email, routed it, or requested complementary documents.
Average classification time: 18 hours (promised SLA: 4h).
Attachments: PDFs (police reports, medical reports, photos), images, forwarded WhatsApp audio.
Chronic Monday backlog (~14k messages).
Recurring Procon complaints about delays.

What the agent does

Receives the email via Gmail API + webhook.
Reads body + all attachments (Gemini 2.5 Pro multimodal: PDF, image, audio).
Identifies the line (auto, home, life, health, liability, equipment).
Identifies the policyholder: cross-references the CPF or policy number mentioned in the email with the core system.
Classifies the type: new claim notice, complement to existing claim, question, complaint, spam.
Extracts structured data: event date, location, description, estimated value, attached documents, witness data.
Decides the next action:
- New claim with complete data → opens claim in core.
- New claim with missing data → replies with checklist of what is missing.
- Complement → links to existing claim.
- Question → routes to customer service with context.
- Complaint → escalates to ombudsman with severity classification.
Replies to the policyholder in clear Brazilian Portuguese (PT-BR) with a protocol number.
Logs everything: original email, classification, actions, sent reply.
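
The "decides the next action" step above can be sketched as a small dispatch function. This is a minimal illustration, not the insurer's implementation: the names `EmailType`, `TriageResult`, `REQUIRED_FIELDS` and `decide_next_action` are hypothetical, the set of required fields is an assumption, and the source does not say what happens to spam (discarding it is assumed here).

```python
from dataclasses import dataclass, field
from enum import Enum


class EmailType(Enum):
    NEW_CLAIM = "new_claim"
    COMPLEMENT = "complement"
    QUESTION = "question"
    COMPLAINT = "complaint"
    SPAM = "spam"


# Assumption: the minimum fields needed to open a claim automatically.
REQUIRED_FIELDS = ("event_date", "location", "description")


@dataclass
class TriageResult:
    email_type: EmailType
    extracted: dict = field(default_factory=dict)


def missing_fields(extracted: dict) -> list[str]:
    """Return the required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not extracted.get(name)]


def decide_next_action(result: TriageResult) -> tuple[str, list[str]]:
    """Map a triage result to (action, missing_fields)."""
    if result.email_type is EmailType.NEW_CLAIM:
        missing = missing_fields(result.extracted)
        if missing:
            # Incomplete claim: reply with a checklist of what is missing.
            return "reply_with_checklist", missing
        return "open_claim", []
    if result.email_type is EmailType.COMPLEMENT:
        return "link_to_existing_claim", []
    if result.email_type is EmailType.QUESTION:
        return "route_to_customer_service", []
    if result.email_type is EmailType.COMPLAINT:
        return "escalate_to_ombudsman", []
    # Assumption: spam is discarded; the source lists no action for it.
    return "discard", []
```

Keeping the decision in plain code rather than in the prompt makes each routing rule easy to audit and to test independently of the model.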

Architecture

Gemini Enterprise Plus: orchestration + model access.
Gemini 2.5 Pro: multimodal reading (email + attachments).
Gemini 2.5 Flash: fast classification and routing (cost fallback).
Vertex AI Search: knowledge base (underwriting policies, product manuals, anonymized prior decisions).
Tools:
- Policyholder lookup (CPF/policy → data).
- Claim opening in core (SOAP via Apigee).
- Complement linking.
- Reply via Gmail.
- Routing to ombudsman/customer service.
Cloud Run: webhook + retry logic.
BigQuery: structured log for analysis.
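
The source describes the Cloud Run component only as "webhook + retry logic". One common way to implement such retries is exponential backoff, which the sketch below assumes. `TransientError`, `call_with_retry` and the default values are hypothetical names and numbers chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx)."""


def call_with_retry(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying transient failures with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                # Out of attempts: re-raise so the caller can hand the
                # email back to the human queue.
                raise
            # Delays grow as base, 2x base, 4x base, ...
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Injecting `sleep` keeps the function testable without real waiting; in production the default `time.sleep` applies.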

Results in 60 days

Metric	Before	After	Delta
Average classification time	18 h	90 s	−99.9%
Volume processed	8,000/day	8,000/day	=
Automatic claim opening	0%	73%	+73 pts
Cases still requiring human	100%	27%	−73 pts
SLA met	34%	97%	+63 pts
Procon complaints (3 months)	142	38	−73%
Human triagers	20	8	12 reallocated

The 12 reallocated triagers moved to complex claim analysis (potential fraud, multi-victim cases), work that was previously outsourced. The project reached payback in 4 months.
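
The Delta column mixes two kinds of change: relative change in percent (classification time, Procon complaints) and differences in percentage points (SLA met, automatic claim opening). The short sketch below recomputes three of the deltas from the Before and After values in the table. The function names are illustrative.

```python
def relative_change_pct(before: float, after: float) -> float:
    """Relative change in percent; negative means a reduction."""
    return (after - before) / before * 100


def point_change(before_pct: float, after_pct: float) -> float:
    """Difference between two percentages, in percentage points."""
    return after_pct - before_pct


# 18 hours expressed in seconds, compared with 90 seconds.
classification = relative_change_pct(18 * 3600, 90)
procon = relative_change_pct(142, 38)
sla = point_change(34, 97)

print(f"classification time: {classification:.2f}%")  # -99.86%
print(f"Procon complaints: {procon:.0f}%")            # -73%
print(f"SLA met: {sla:+.0f} pts")                     # +63 pts
```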

What worked — and why

1. True multimodal

OCR alone was not enough. Whether the input is a smudged police report photographed on a cell phone or witness audio forwarded from WhatsApp, Gemini 2.5 Pro reads it directly, without a separate transcription pipeline.

2. Human-sounding reply

We invested in prompt design to avoid a robotic tone. Each reply addresses the policyholder by name, paraphrases the event to show it was understood, lists what is still needed, and gives a protocol number and a deadline. Policyholder NPS rose 24 pts.

3. Classification with numeric confidence

The agent returns a confidence score (0–1) per category. Anything below 0.85 goes to a human. The threshold was calibrated on 500 real cases, and it sharply reduces errors in automatic classification.
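
The threshold rule can be sketched in a few lines. This is a minimal illustration that assumes the agent's output is available as a dictionary of category scores; the function name and the return labels are hypothetical, and only the 0.85 cutoff comes from the text.

```python
CONFIDENCE_THRESHOLD = 0.85  # cutoff cited in the text


def route_by_confidence(
    scores: dict[str, float],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> tuple[str, str]:
    """Return (destination, top_category) for one classified email."""
    if not scores:
        # No usable classification at all: always send to a human.
        return "human", ""
    top_category = max(scores, key=scores.get)
    # Scores strictly below the threshold go to a human triager.
    destination = "agent" if scores[top_category] >= threshold else "human"
    return destination, top_category
```

For example, `route_by_confidence({"new_claim": 0.60, "complement": 0.38})` returns `("human", "new_claim")`, so the triager still sees the agent's best guess as context.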

4. Closed learning loop

Each case a human corrects becomes an example in the gold set. We re-evaluated against it weekly for the first 8 weeks; per-category recall rose 11 pts over that period.
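
Per-category recall is the share of emails that truly belong to a category and that the agent also labeled with that category. A minimal sketch of the weekly evaluation, assuming the gold set is available as pairs of (human label, agent label); the function name and data shape are illustrative.

```python
from collections import Counter


def per_category_recall(pairs):
    """Compute recall per category from (human_label, agent_label) pairs."""
    totals = Counter()  # how many gold examples each category has
    hits = Counter()    # how many of those the agent labeled correctly
    for truth, predicted in pairs:
        totals[truth] += 1
        if truth == predicted:
            hits[truth] += 1
    return {category: hits[category] / totals[category] for category in totals}


gold = [
    ("new_claim", "new_claim"),
    ("new_claim", "question"),
    ("complaint", "complaint"),
    ("question", "question"),
]
print(per_category_recall(gold))
# {'new_claim': 0.5, 'complaint': 1.0, 'question': 1.0}
```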

What went wrong — and how we worked around it

Huge attachments

A 200-page PDF (a medical report) exceeded the token limit. Solution: pre-summarize it in chunks with Gemini Flash before sending it to the main agent.
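
The chunked pre-summarization can be sketched as follows. The model call is deliberately left out: `summarize` is a placeholder callable standing in for a Gemini Flash request, and the chunk size of 20 pages is an assumption, not a figure from the project.

```python
from typing import Callable, Sequence


def chunk_pages(pages: Sequence[str], pages_per_chunk: int = 20) -> list[list[str]]:
    """Split the page texts into consecutive groups of pages_per_chunk."""
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    return [
        list(pages[start:start + pages_per_chunk])
        for start in range(0, len(pages), pages_per_chunk)
    ]


def presummarize(
    pages: Sequence[str],
    summarize: Callable[[str], str],
    pages_per_chunk: int = 20,
) -> str:
    """Summarize each chunk separately and join the partial summaries."""
    partials = []
    for index, chunk in enumerate(chunk_pages(pages, pages_per_chunk), start=1):
        # `summarize` is a stand-in for the call to the smaller model.
        partials.append(f"[part {index}] " + summarize("\n".join(chunk)))
    return "\n".join(partials)
```

The joined partial summaries, labeled by part, are what the main agent would receive in place of the full document.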

Sophisticated spam

Fake insurance invoices landed in the inbox. We trained a dedicated spam/phishing classifier and run it as the first pipeline stage.

Heavy-accent audio

Gemini 2.5 Pro has improved a lot on regional PT-BR accents but still makes mistakes. When transcription confidence drops, the agent politely asks the policyholder to write or call.

Email chains

Threads with 15 replies produced confusing context. We added thread summarization as a pre-processing step.

⚠️ Non-negotiable SUSEP governance

Audit log with hash of original email + decisions.
Final human decision in cases above R$ 50k exposure.
Quarterly bias evaluation (denial by neighborhood, gender, age).
DPIA approved by DPO before go-live.
Continuity plan: if agent goes down, queue returns to human triagers without loss.
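
Two of these governance rules, the hashed audit log and the human decision above R$ 50k, lend themselves to a short sketch. The record layout, field names and function name below are hypothetical. SHA-256 is an assumption, since the source only says "hash". Only the R$ 50k threshold comes from the text.

```python
import hashlib
from datetime import datetime, timezone

HUMAN_DECISION_EXPOSURE_BRL = 50_000  # threshold cited in the text


def build_audit_record(
    raw_email: bytes,
    classification: str,
    actions: list[str],
    exposure_brl: float,
) -> dict:
    """Build one audit entry tying the decisions to the exact email bytes."""
    return {
        # Hash of the original message, so later tampering is detectable.
        "email_sha256": hashlib.sha256(raw_email).hexdigest(),
        "classification": classification,
        "actions": actions,
        "exposure_brl": exposure_brl,
        # Cases above the exposure threshold need a final human decision.
        "requires_human_decision": exposure_brl > HUMAN_DECISION_EXPOSURE_BRL,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }
```

Every value in the record is a plain string, number, boolean or list, so it can be serialized to JSON and stored as a structured log row.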

The NPS gain from policyholders (+24 pts) surprised us more than the ROI: a human-sounding reply, immediate protocol and a clear deadline beat 18 hours of silence.

Replicability

The pattern works for any high-volume mailbox with complex rules: claims, industrial customer service, HR (onboarding), legal (subpoenas), bank back office. We detail the financial vertical in Gemini Enterprise for financial services.

Frequently Asked Questions about email triage with an agent: how an insurer processes 8,000 messages/day

How many emails did the insurer process daily in the sinistros@ inbox? The insurer processed 8,000 emails per day in the sinistros@ inbox.

What was the average email classification time before the agent implementation? The average email classification time was 18 hours before the agent implementation.

What was the impact of the agent implementation on SLA compliance for classification? SLA compliance for classification increased from 34% to 97% after the agent implementation.

What percentage of claims were opened automatically after the agent implementation? Automatic claim opening reached 73% after the agent implementation.
