Gemini API now has webhooks: the end of polling in Veo and batch pipelines
Google launched native webhooks in the Gemini API today — built on the Standard Webhooks spec. If you're still polling `operation.result` for Veo video generation or batch jobs, now is the time to rethink your architecture.
Fabiano Brito
CTO, Autenticare
The `while not operation.done: time.sleep(5)` pattern is code every team building on video generation or batch jobs knows by heart. It works, but it’s an antipattern: it blocks threads, wastes API calls on status checks, and adds artificial latency proportional to the polling interval. With today’s Google AI Studio announcement, that code can — and should — be retired.
What was announced
Google launched webhook support in the Gemini API, covering three classes of events:
- Video generation complete — `video.generated` with Veo (including `veo-3.1-generate-preview`)
- Batch job finished — notification on batch processing job completion
- Agentic workflow needs attention — trigger for human-in-the-loop in agent pipelines
The technical spec is Standard Webhooks (standardwebhooks.com): HTTP POST requests with HMAC-signed payloads, webhook-id + webhook-timestamp + webhook-signature headers, and at-least-once delivery semantics with exponential retry.
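Concretely, under the Standard Webhooks spec the signature is an HMAC-SHA256 over the string `{webhook-id}.{webhook-timestamp}.{raw body}`, base64-encoded and carried in the `webhook-signature` header as `v1,<signature>`. A standalone verification sketch (the secret and header values here are placeholders):

```python
import base64
import hashlib
import hmac


def verify_standard_webhook(secret: bytes, headers: dict, body: bytes) -> bool:
    # Per the Standard Webhooks spec, the signed content is
    # "{webhook-id}.{webhook-timestamp}.{raw body}".
    signed_content = b".".join([
        headers["webhook-id"].encode(),
        headers["webhook-timestamp"].encode(),
        body,
    ])
    expected = base64.b64encode(
        hmac.new(secret, signed_content, hashlib.sha256).digest()
    ).decode()
    # The header may carry several space-separated signatures
    # (e.g. during secret rotation), each in the form "v1,<base64>".
    candidates = [
        s.split(",", 1)[-1] for s in headers["webhook-signature"].split()
    ]
    return any(hmac.compare_digest(expected, c) for c in candidates)
```

Because the spec fixes the signed-content format, this function is provider-agnostic: it verifies any Standard Webhooks sender, not just Gemini.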
Two registration modes
The API supports two configuration modes:
Static (project-level): configured via the Google Cloud Console or admin API. All operations in the project send notifications to the registered endpoint. Ideal for production pipelines with a centralized receiver.
Dynamic (per-request): passed inline in the API call. Each job can specify its own endpoint — useful in multi-tenant architectures where each customer has their own receiver, or in test flows.
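For the dynamic mode, per-tenant routing can be as simple as a lookup table. A sketch with placeholder endpoints; the returned dict mirrors the fields passed to the SDK's `WebhookConfig`:

```python
# Placeholder tenant registry — in production this would live in a database.
TENANT_ENDPOINTS = {
    "acme": "https://acme.example.com/webhooks/gemini",
    "globex": "https://globex.example.com/webhooks/gemini",
}


def webhook_config_for(tenant_id: str) -> dict:
    # Mirrors the uri/subscribed_events fields of types.WebhookConfig,
    # so each tenant's jobs notify that tenant's own receiver.
    return {
        "uri": TENANT_ENDPOINTS[tenant_id],
        "subscribed_events": ["video.generated"],
    }
```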
Python SDK: what the code looks like
Before (polling):
```python
import time
from google import genai

client = genai.Client()

operation = client.models.generate_videos(
    model="veo-3.1-generate-preview",
    prompt="time-lapse of a server room at night",
)

# Polling loop — blocks the thread, wastes API calls
while not operation.done:
    time.sleep(10)
    operation = client.operations.get(operation)

video = operation.result.generated_videos[0]
```
After (webhook):
```python
from google import genai
from google.genai import types

client = genai.Client()

# Dynamic registration: the job notifies its receiver on completion
operation = client.models.generate_videos(
    model="veo-3.1-generate-preview",
    prompt="time-lapse of a server room at night",
    config=types.GenerateVideosConfig(
        webhook=types.WebhookConfig(
            uri="https://your-api.com/webhooks/gemini",
            subscribed_events=["video.generated"],
        )
    ),
)

# Returns immediately — no blocking
print(f"Job started: {operation.name}")
```
The receiver gets an HTTP POST with a signed payload:
```python
# FastAPI receiver (minimal example)
import base64
import hashlib
import hmac

from fastapi import FastAPI, Header, HTTPException, Request

app = FastAPI()
WEBHOOK_SECRET = b"your-secret-here"


@app.post("/webhooks/gemini")
async def handle_gemini_event(
    request: Request,
    webhook_id: str = Header(...),
    webhook_timestamp: str = Header(...),
    webhook_signature: str = Header(...),
):
    body = await request.body()

    # Signature verification (Standard Webhooks): the signed content is
    # "{webhook-id}.{webhook-timestamp}.{raw body}"
    signed_content = f"{webhook_id}.{webhook_timestamp}.".encode() + body
    expected = base64.b64encode(
        hmac.new(WEBHOOK_SECRET, signed_content, hashlib.sha256).digest()
    ).decode()
    received = webhook_signature.split(",")[-1]  # strip the "v1," prefix
    if not hmac.compare_digest(expected, received):
        raise HTTPException(status_code=401, detail="invalid signature")

    payload = await request.json()
    event_type = payload["type"]  # e.g. "video.generated"
    if event_type == "video.generated":
        video_uri = payload["data"]["uri"]
        # trigger downstream: save to GCS, notify user, etc.
    return {"ok": True}
```
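One consequence of at-least-once delivery: the same event can arrive more than once, so any handler with side effects should deduplicate. A minimal sketch keyed on the `webhook-id` header, with an in-memory set standing in for a real store:

```python
# At-least-once delivery means duplicates are possible; dedup on webhook-id.
processed_ids: set = set()  # use Redis or a DB with a TTL in production


def handle_once(webhook_id: str, payload: dict) -> bool:
    """Process an event exactly once; returns False for duplicates."""
    if webhook_id in processed_ids:
        return False
    processed_ids.add(webhook_id)
    if payload.get("type") == "video.generated":
        ...  # downstream work: save to GCS, notify user, etc.
    return True
```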
Why this changes the architecture
Polling was a convenience hack that worked for prototyping but doesn’t scale. Three real problems it creates in production:
1. Blocked threads. In synchronous Python, each in-progress job holds a thread waiting. With 50 concurrent generations, you need 50 workers just to sit and wait.
2. Artificial latency. A 10-second interval means you deliver the video, on average, 5 seconds later than necessary. With webhooks, the notification arrives within milliseconds of completion.
3. Fragility on restarts. If the process running the loop crashes, you lose the operation state. With webhooks, the receiver is stateless — Google redelivers if it doesn’t get a 2xx.
A webhook isn't just a faster notification — it's an inversion-of-control contract. The model calls you; you don't need to keep calling the model.
Standard Webhooks: the detail that matters for platform teams
The Gemini API adopting the Standard Webhooks spec is not a minor detail. It means the same signature validation code you already use for Stripe or GitHub events works here — just swap the secret and the header field name.
For teams that already have event ingestion infrastructure (a FastAPI/Express receiver, an SQS or Pub/Sub queue in front), integrating Gemini requires no new verification code. Just register the endpoint and add handlers for Gemini event types.
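With a queue in front, the receiver's only jobs are verifying the signature and enqueueing; returning a 2xx quickly stops Google's retry clock. A minimal in-process sketch of that split (swap the stdlib queue for Pub/Sub or SQS in production):

```python
import queue
import threading

events: queue.Queue = queue.Queue()
processed: list = []


def worker() -> None:
    # Drains the queue off the request path; the append stands in for
    # real downstream work (download video, notify user, ...).
    while True:
        payload = events.get()
        processed.append(payload)
        events.task_done()


threading.Thread(target=worker, daemon=True).start()


def enqueue(payload: dict) -> None:
    # Called from the HTTP handler after signature verification;
    # the handler can return 2xx immediately afterwards.
    events.put(payload)
```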
Implications for agentic pipelines
Beyond Veo and batch jobs, the announcement covers a third event: agent.needs_attention — fired when an agentic workflow reaches a decision point requiring human intervention.
This is the missing primitive for human-in-the-loop in production. Previously, you had to build an approval queue from scratch (database polling, WebSocket, or long-poll HTTP). With the native event, the agent pushes the notification to you — and the human reviewer can respond via API before the agent continues.
For teams building multi-step approval pipelines (KYC, contracts, risk analysis), this significantly reduces the complexity of the orchestration layer.
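A sketch of what the receiving side of that loop could look like. The field names inside `data` (`operation_name`, `reason`) are assumptions for illustration, not the documented payload schema:

```python
# Pending decisions land here; in production this would be a ticket
# system or an approval UI backed by a database.
REVIEW_QUEUE: list = []


def route_agent_event(payload: dict) -> str:
    # Hypothetical handler for agent.needs_attention; data field names
    # are assumptions — check the actual payload schema.
    if payload.get("type") != "agent.needs_attention":
        return "ignored"
    data = payload.get("data", {})
    REVIEW_QUEUE.append({
        "operation": data.get("operation_name"),
        "reason": data.get("reason"),
    })
    # A human reviewer picks this up and resumes the agent via the API.
    return "queued"
```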
Need to migrate a Veo or batch pipeline to webhooks?
Autenticare designs and implements agentic pipelines with Gemini — including migration from polling-based flows to event-driven architectures with resilient receivers and retry queues.
