Autenticare

Email triage with an agent: how an insurer processes 8,000 messages/day

The claims mailbox was receiving 8,000 emails/day and missing its SLA. A Gemini Enterprise agent now classifies each email, extracts the data, opens the claim in the core system and replies, all in 90 seconds per email.

Fabiano Brito


CEO & Founder

TL;DR: A mid-size Brazilian insurer (~1.5M policyholders) replaced manual claims triage with a Gemini Enterprise agent. In 60 days, average classification time fell from 18h to 90s, 73% of claims were opened automatically, Procon complaints dropped from 142 to 38, and 12 triagers were reallocated to fraud analysis.
  • 8,000 emails/day (claims@ inbox)
  • 18h → 90s classification time (promised SLA: 4h)
  • 34% → 97% SLA met (before vs. after)

Email triage is the most underestimated use case in enterprise AI. It sounds simple — "read the email and classify it". At real scale, with 6 dialects of business rules, 12 insurance lines and heterogeneous attachments, this is where 80% of pilots die from underestimating complexity.

This case shows what works when the problem gets the respect it deserves.


Starting point

  • The central claims mailbox (sinistros@) received 8,000 emails/day.
  • 20 human triagers classified each email, routed it or requested complementary information.
  • Average classification time: 18 hours (promised SLA: 4h).
  • Attachments: PDFs (police report, medical report, photos), images, forwarded WhatsApp audio.
  • Chronic Monday backlog (~14k messages).
  • Recurring Procon complaints about delays.

What the agent does

  1. Receives the email via Gmail API + webhook.
  2. Reads body + all attachments (Gemini 2.5 Pro multimodal: PDF, image, audio).
  3. Identifies the line (auto, home, life, health, liability, equipment).
  4. Identifies the policyholder: cross-references the CPF or policy number mentioned in the email against the core system.
  5. Classifies the type: new claim notice, complement to existing claim, question, complaint, spam.
  6. Extracts structured data: event date, location, description, estimated value, attached documents, witness data.
  7. Decides the next action:
    • New claim with complete data → opens claim in core.
    • New claim with missing data → replies with checklist of what is missing.
    • Complement → links to existing claim.
    • Question → routes to customer service with context.
    • Complaint → escalates to ombudsman with severity classification.
  8. Replies to the policyholder in clear PT-BR with protocol number.
  9. Logs everything: original email, classification, actions, sent reply.
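
The decision in step 7 can be sketched as a small routing function. This is a minimal illustration, not the production agent: the `TriageResult` type, the action names and the `REQUIRED_FIELDS` list are all hypothetical, and the article does not say what happens to spam, so discarding it is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum


class EmailType(Enum):
    NEW_CLAIM = "new_claim"
    COMPLEMENT = "complement"
    QUESTION = "question"
    COMPLAINT = "complaint"
    SPAM = "spam"


# Hypothetical: fields a claim needs before it can be opened in the core.
REQUIRED_FIELDS = ("event_date", "location", "description")


@dataclass
class TriageResult:
    email_type: EmailType
    extracted: dict = field(default_factory=dict)


def missing_fields(result: TriageResult) -> list[str]:
    """Return the required fields the extraction step left empty."""
    return [f for f in REQUIRED_FIELDS if not result.extracted.get(f)]


def next_action(result: TriageResult) -> str:
    """Map a triage result to one of the actions listed in step 7."""
    if result.email_type is EmailType.NEW_CLAIM:
        if missing_fields(result):
            return "reply_with_checklist"
        return "open_claim"
    if result.email_type is EmailType.COMPLEMENT:
        return "link_to_existing_claim"
    if result.email_type is EmailType.QUESTION:
        return "route_to_customer_service"
    if result.email_type is EmailType.COMPLAINT:
        return "escalate_to_ombudsman"
    return "discard"  # spam: assumed, the article lists no action for it
```

Keeping the routing deterministic, with the model only supplying the type and the extracted fields, makes each decision easy to log and audit.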

Architecture

  • Gemini Enterprise Plus: orchestration + model access.
  • Gemini 2.5 Pro: multimodal reading (email + attachments).
  • Gemini 2.5 Flash: fast classification and routing (cost fallback).
  • Vertex AI Search: knowledge base (underwriting policies, product manuals, anonymized prior decisions).
  • Tools:
    • Policyholder lookup (CPF/policy → data).
    • Claim opening in core (SOAP via Apigee).
    • Complement linking.
    • Reply via Gmail.
    • Routing to ombudsman/customer service.
  • Cloud Run: webhook + retry logic.
  • BigQuery: structured log for analysis.
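
The "retry logic" on the Cloud Run side is not described in detail. One common approach is exponential backoff with jitter around calls to the core system, sketched below. `TransientError`, `call_with_retry` and the default delays are hypothetical names and values chosen for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx)."""


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying transient failures with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up; the caller can hand the email to the human queue
            # Delay doubles each attempt: base, 2*base, 4*base, ... plus up to 10% jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 10))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function testable without real waiting.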

Results in 60 days

Metric | Before | After | Delta
Average classification time | 18 h | 90 s | −99.8%
Volume processed | 8,000/day | 8,000/day | =
Automatic claim opening | 0% | 73% | +73 pts
Cases still requiring human | 100% | 27% | −73 pts
SLA met | 34% | 97% | +63 pts
Procon complaints (3 months) | 142 | 38 | −73%
Human triagers | 20 | 8 | 12 reallocated

The 12 reallocated triagers moved to complex claim analysis (potential fraud, multi-victim cases), an area that had previously been outsourced. Full project payback: 4 months.


What worked — and why

1. True multimodal

OCR alone was not enough. A smudged police report photographed on a cell phone, a witness statement recorded as WhatsApp audio: Gemini 2.5 Pro reads them all directly, without a separate transcription pipeline.

2. Human-sounding reply

We invested in prompt design to avoid a robotic tone. Each reply addresses the policyholder by name, paraphrases the event (to show it was understood), lists what is needed, and gives the protocol number and deadline. Policyholder NPS rose 24 pts.

3. Classification with numeric confidence

The agent returns a confidence score (0–1) per category. Anything below 0.85 goes to a human. The threshold was calibrated on 500 real cases, and it sharply reduces errors in automatic classification.
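
A minimal sketch of that gate, assuming the agent's output can be read as a dictionary of category scores. The function name and return shape are hypothetical; only the 0.85 threshold comes from the case.

```python
CONFIDENCE_THRESHOLD = 0.85  # value stated in the article


def route_by_confidence(
    scores: dict[str, float],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> tuple[str, str]:
    """Return (route, category): 'auto' if the top score clears the threshold, else 'human'."""
    if not scores:
        return ("human", "unknown")  # nothing to act on: always send to a person
    category = max(scores, key=scores.get)
    route = "auto" if scores[category] >= threshold else "human"
    return (route, category)
```

For example, `route_by_confidence({"new_claim": 0.60, "complement": 0.38})` sends the email to a human, with `new_claim` attached as the suggested category.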

4. Closed learning loop

Each case a human corrects becomes an example in the gold set. We re-evaluated weekly for the first 8 weeks. Per-category recall rose 11 pts over that period.
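
Per-category recall is the share of gold-set examples of a category that the agent labeled correctly. A short sketch of how the weekly re-evaluation could compute it, assuming the gold labels and the agent's predictions are available as parallel lists (the function name is hypothetical):

```python
from collections import Counter


def per_category_recall(gold: list[str], predicted: list[str]) -> dict[str, float]:
    """Recall per category: correct predictions / gold examples of that category."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    totals = Counter(gold)
    hits = Counter(g for g, p in zip(gold, predicted) if g == p)
    return {category: hits[category] / totals[category] for category in totals}
```

Comparing this dictionary week over week shows which categories benefit from the newly added corrections.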


What went wrong — and how we worked around it

Huge attachments

A 200-page PDF (medical report) exceeded the token limit. Solution: pre-summarize it in chunks with Gemini Flash before sending it to the main agent.
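
The chunked pre-summarization can be sketched as below. The model call is deliberately left out: `summarize` is a hypothetical callable that would wrap the Flash model, and the 20-pages-per-chunk default is an assumption, not a figure from the project.

```python
from typing import Callable


def chunk_pages(pages: list[str], pages_per_chunk: int = 20) -> list[str]:
    """Group page texts into chunks of at most pages_per_chunk pages."""
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be >= 1")
    return [
        "\n".join(pages[i:i + pages_per_chunk])
        for i in range(0, len(pages), pages_per_chunk)
    ]


def presummarize(
    pages: list[str],
    summarize: Callable[[str], str],
    pages_per_chunk: int = 20,
) -> str:
    """Summarize each chunk independently, then join the partial summaries."""
    partials = [summarize(chunk) for chunk in chunk_pages(pages, pages_per_chunk)]
    return "\n\n".join(partials)
```

The joined summaries, rather than the full document, are what the main agent receives alongside the email body.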

Sophisticated spam

Fake insurance invoices landed in the inbox. We trained a dedicated spam/phishing classifier and run it as the first pipeline stage.

Heavy-accent audio

Gemini 2.5 Pro has improved a lot on regional PT-BR accents but still makes mistakes. When transcription confidence drops, the agent politely asks the policyholder to write or call.

Email chains

Threads with 15 replies produced confusing context. We added thread summarization as a pre-processing step.


⚠️ Non-negotiable SUSEP governance

  • Audit log with hash of original email + decisions.
  • Final human decision in cases above R$ 50k exposure.
  • Quarterly bias evaluation (denial by neighborhood, gender, age).
  • DPIA approved by DPO before go-live.
  • Continuity plan: if agent goes down, queue returns to human triagers without loss.

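
Two of these controls are easy to illustrate in code: hashing the original email for the audit log and gating high-exposure cases. The sketch below assumes SHA-256 (the article does not name the hash function), and the record layout and function names are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def audit_record(raw_email: bytes, decisions: dict) -> dict:
    """Build an audit entry containing a SHA-256 hash of the original email."""
    return {
        "email_sha256": hashlib.sha256(raw_email).hexdigest(),
        "decisions": decisions,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }


def requires_human_decision(exposure_brl: float, limit_brl: float = 50_000) -> bool:
    """Cases above the exposure limit (R$ 50k in the article) get a final human decision."""
    return exposure_brl > limit_brl
```

Hashing the raw bytes of the message, before any parsing or summarization, lets an auditor later verify that the logged decision refers to exactly the email that was received.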
The NPS gain from policyholders (+24 pts) surprised us more than the ROI: a human-sounding reply, immediate protocol and a clear deadline beat 18 hours of silence.

Replicability

The pattern works for any high-volume mailbox with complex rules: claims, industrial customer service, HR (onboarding), legal (subpoenas), bank back office. We detail the financial vertical in Gemini Enterprise for financial services.
