Capturing inspections over WhatsApp
In large parts of EMEA, LATAM, and India, and across the long tail of insurance and rental distribution, WhatsApp is the default support channel. End users will not download a fleet operator's mobile app to take an end-of-ride photo, and they won't follow a magic link to a browser scanner either. They will, however, reply to a message.
This guide walks through wiring VerifyAI behind a WhatsApp Business
number so an operator can text a customer, get photos back, and run
each one through processVerification() — exactly the same pipeline
the mobile SDKs use.
The WhatsApp capture channel is in preview. You need a Twilio account, an approved WhatsApp Business sender, and Meta-approved message templates before it can be used in production. Sender approval takes 24-72 hours; template approval is a separate flow (see "Going live" below).
Architecture
Customer phone
│ (sends/receives WhatsApp messages)
▼
Twilio WhatsApp Business API
│ (HTTP webhook, signed)
▼
POST /api/v1/whatsapp/inbound
│ (verify X-Twilio-Signature, parse Body / MediaUrl0)
▼
WhatsApp state machine
(lib/verify-ai/whatsapp-state-machine.ts)
│ (advance state, download media via Twilio)
▼
processVerification() ──► same pipeline as the SDK
│
▼
Reply WhatsApp message (template or freeform)
The state machine is server-side and stateful per phone number. A
row in verify_ai_whatsapp_sessions (migration
20260610_whatsapp_sessions.sql) keeps the current state, captured
media URLs, and the customer / policy context across the
multi-message conversation.
The conversation state machine
Each session moves through these states:
| State | Trigger | Reply |
| ---------------- | -------------------------------------- | -------------------------------------------------------------------- |
| greeting | First inbound from a new number | Localized greeting template, asks consent. |
| consent | User replies "yes" / "1" / equivalent | Sends instructions for the first photo. |
| instructions | After consent | Step-by-step "how to take a good photo" guidance. |
| capture_front | Media uploaded for front shot | Runs verification; on pass, prompts for back shot. |
| capture_back | Media uploaded for back shot | Runs verification; on pass, prompts for damage shot. |
| capture_damage | Media uploaded for damage shot | Final verification; transitions to review. |
| review | All photos captured | Transient — the webhook resolves the final outcome from the latest verification. |
| submitted | Review resolved | Sends compliant/non-compliant confirmation; session is closed and completed_verification_id is recorded. |
| abandoned | 24h idle, opt-out, or 3 retries in a step | Session is closed; further inbound from the same number starts a fresh session. |
Missing or unusable inbound (no image where one was expected) loops
back to the same capture_* state with a retry message. The state
machine allows up to 3 retries per step before moving to
abandoned. A plain "no" / "stop" / "opt out" at any non-terminal
state also moves the session to abandoned immediately.
The capture_damage step additionally accepts a "none" / "no
damage" sentinel — if the customer texts that instead of a photo,
the session transitions straight to review with no damage shot.
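The retry, opt-out, and no-damage rules above can be sketched as a pure transition function. This is a minimal illustration, not the contents of lib/verify-ai/whatsapp-state-machine.ts: the `Session` and `Inbound` shapes and the `advanceCapture` name are hypothetical, the sketch assumes every received image passes verification, and it assumes the session is abandoned on the miss that follows the third retry.

```typescript
// Hypothetical sketch of the capture-step rules; all names here are assumptions.
type State =
  | "greeting" | "consent" | "instructions"
  | "capture_front" | "capture_back" | "capture_damage"
  | "review" | "submitted" | "abandoned";

interface Session { state: State; retries: number }
interface Inbound { body: string; hasImage: boolean }

const MAX_RETRIES = 3;
const OPT_OUT = new Set(["no", "stop", "opt out"]);
const NO_DAMAGE = new Set(["none", "no damage"]);
const NEXT_AFTER_CAPTURE: Partial<Record<State, State>> = {
  capture_front: "capture_back",
  capture_back: "capture_damage",
  capture_damage: "review",
};

function advanceCapture(session: Session, inbound: Inbound): Session {
  // Terminal states never advance.
  if (session.state === "submitted" || session.state === "abandoned") return session;

  const text = inbound.body.trim().toLowerCase();

  // A plain opt-out abandons the session from any non-terminal state.
  if (OPT_OUT.has(text)) return { state: "abandoned", retries: 0 };

  // Non-capture states are out of scope for this sketch.
  const next = NEXT_AFTER_CAPTURE[session.state];
  if (!next) return session;

  // The damage step accepts a text sentinel instead of a photo.
  if (session.state === "capture_damage" && NO_DAMAGE.has(text)) {
    return { state: "review", retries: 0 };
  }

  // Assumption: the image passed verification, so move to the next step.
  if (inbound.hasImage) return { state: next, retries: 0 };

  // No usable image: stay on the same step until the retries are used up.
  const retries = session.retries + 1;
  return retries > MAX_RETRIES
    ? { state: "abandoned", retries }
    : { state: session.state, retries };
}
```

Keeping the function pure (session in, session out) makes the transitions easy to unit test apart from the webhook and the database row.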
Security: verifying Twilio signatures
Every inbound POST from Twilio carries an X-Twilio-Signature
header. The webhook handler at app/api/v1/whatsapp/inbound/route.ts
recomputes the signature (HMAC-SHA1 over the full URL plus sorted
form parameters, keyed by your Twilio auth token) and rejects
mismatches with 401. Do not disable this check in production —
without it, anyone can POST inbound payloads and trigger
verifications on your dime.
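As an illustration of the check described above, the sketch below recomputes the signature with Node's built-in crypto module and compares it in constant time. It is not the code in route.ts; the function names are hypothetical, and it assumes the form parameters have already been parsed into a flat string map.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: HMAC-SHA1 over the full URL followed by each form
// parameter's name and value, sorted by name, keyed by the auth token.
function computeTwilioSignature(
  authToken: string,
  url: string,
  params: Record<string, string>,
): string {
  const data = Object.keys(params)
    .sort()
    .reduce((acc, key) => acc + key + params[key], url);
  return createHmac("sha1", authToken).update(data, "utf8").digest("base64");
}

// Hypothetical helper: returns false for a missing or mismatched header.
function isValidTwilioSignature(
  authToken: string,
  url: string,
  params: Record<string, string>,
  signatureHeader: string | null,
): boolean {
  if (!signatureHeader) return false;
  const expected = Buffer.from(computeTwilioSignature(authToken, url, params));
  const received = Buffer.from(signatureHeader);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The URL must be the exact public URL Twilio called, including scheme and any query string; behind a proxy or load balancer that usually means rebuilding it from forwarded headers rather than trusting the local request URL. Twilio's Node helper library also provides a validateRequest function that performs the same check.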
Handling media
Twilio doesn't push the image bytes — it pushes a MediaUrl0 URL
that needs to be fetched with Basic auth (TWILIO_ACCOUNT_SID as
username, TWILIO_AUTH_TOKEN as password). The handler:
- Downloads the bytes with downloadTwilioMedia() (Basic auth using the Twilio account SID + auth token).
- Generates a verification ID and uploads to the verify-ai-images bucket via uploadVerificationImage(). The path follows the standard <customerId>/<year>/<month>/<verificationId>.<ext> shape.
- Calls processVerification() with the bytes (base64), the policy bound to the session row, and tagged metadata.
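The first two steps can be sketched roughly as follows. These helpers are hypothetical stand-ins, not the real downloadTwilioMedia() or uploadVerificationImage(); the sketch assumes Node 18+ (global fetch), a zero-padded month, and UTC dates in the storage path.

```typescript
// Hypothetical helper: builds the Basic auth header Twilio media URLs expect.
function basicAuthHeader(accountSid: string, authToken: string): string {
  return "Basic " + Buffer.from(`${accountSid}:${authToken}`).toString("base64");
}

// Hypothetical helper: <customerId>/<year>/<month>/<verificationId>.<ext>.
// Zero-padding the month and using UTC are assumptions.
function buildImagePath(
  customerId: string,
  verificationId: string,
  ext: string,
  at: Date,
): string {
  const year = at.getUTCFullYear();
  const month = String(at.getUTCMonth() + 1).padStart(2, "0");
  return `${customerId}/${year}/${month}/${verificationId}.${ext}`;
}

// Hypothetical helper: fetches the image bytes behind a MediaUrl0 value.
async function fetchMediaBytes(
  mediaUrl: string,
  accountSid: string,
  authToken: string,
): Promise<Uint8Array> {
  const res = await fetch(mediaUrl, {
    headers: { Authorization: basicAuthHeader(accountSid, authToken) },
  });
  if (!res.ok) throw new Error(`Twilio media download failed: ${res.status}`);
  return new Uint8Array(await res.arrayBuffer());
}
```

The bytes returned by a helper like fetchMediaBytes() can then be base64-encoded with Buffer.from(bytes).toString("base64") before being handed to the verification call.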
The verification result is stored in verify_ai_verifications like
any other capture, with the following metadata pinned for
traceability:
{
"channel": "whatsapp",
"whatsapp_session_id": "<uuid>",
"capture_step": "front" | "back" | "damage"
}
Idle timeout
A session that has been idle for more than 24 hours since
last_message_at is treated as abandoned. The idle check is
enforced in two places:
- The webhook query that finds an active session for an inbound message excludes terminal states, and the state machine's isIdleTimedOut() is re-checked on entry as a safety net.
- Any new inbound from the same phone after timeout starts a fresh session at the webhook layer.
There is no separate scheduled sweep today — abandonment is recognized lazily on the next inbound. Don't rely on the customer to gracefully end the conversation; they won't.
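A minimal sketch of the lazy check, assuming a session row that exposes its state and last_message_at as a Date. The function and field names below are illustrative; the real isIdleTimedOut() signature may differ.

```typescript
// Hypothetical row shape; only the two fields the check needs.
interface SessionRow { state: string; lastMessageAt: Date }

const IDLE_TIMEOUT_MS = 24 * 60 * 60 * 1000;
const TERMINAL_STATES = new Set(["submitted", "abandoned"]);

// True only when strictly more than 24 hours have passed.
function isIdleTimedOutSketch(lastMessageAt: Date, now: Date = new Date()): boolean {
  return now.getTime() - lastMessageAt.getTime() > IDLE_TIMEOUT_MS;
}

// Returns the row to continue with, or null when the webhook should start
// a fresh session (no row, terminal state, or idle timeout).
function resolveActiveSession(row: SessionRow | null, now: Date): SessionRow | null {
  if (!row || TERMINAL_STATES.has(row.state)) return null;
  return isIdleTimedOutSketch(row.lastMessageAt, now) ? null : row;
}
```

Because nothing runs between inbounds, a timed-out row keeps its last non-terminal state in the table until the next message arrives, which is worth remembering when querying the table for reporting.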
Internationalization
The state machine reads its outbound strings from
lib/verify-ai/i18n via the t(locale, key) helper. Four locales
are seeded today:
| Locale | Status |
| ------ | -------------------------------------- |
| en | Available — en.json. |
| es | Available — es.json. |
| fr | Available — fr.json. |
| de | Available — de.json. |
| pt-BR| Roadmap. |
| hi | Roadmap. |
Locale is resolved at session-create time and stored on
context.locale. Unsupported codes fall back to en via
resolveLocale(). Both yes/no keyword matching and the
damage-skip "none" sentinel recognize multi-language variants
(yes, si, oui, ja; no, nein, non; none, ninguno,
aucun, keine).
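The fallback and keyword matching can be sketched as below. The keyword lists come from the paragraph above; everything else is an assumption for illustration, including the function names, exact-match comparison, and the accent stripping that lets "sí" match "si". The real resolveLocale() may behave differently.

```typescript
const SUPPORTED_LOCALES = ["en", "es", "fr", "de"] as const;
type Locale = (typeof SUPPORTED_LOCALES)[number];

// Hypothetical stand-in for resolveLocale(): unsupported codes fall back to en.
function resolveLocaleSketch(code: string | undefined): Locale {
  const normalized = (code ?? "").trim().toLowerCase();
  return (SUPPORTED_LOCALES as readonly string[]).includes(normalized)
    ? (normalized as Locale)
    : "en";
}

const YES_WORDS = new Set(["yes", "si", "oui", "ja"]);
const NO_WORDS = new Set(["no", "nein", "non"]);
const NONE_WORDS = new Set(["none", "ninguno", "aucun", "keine"]);

type ReplyKind = "yes" | "no" | "none" | "other";

// Hypothetical classifier: lowercases, trims, and strips combining accents
// before an exact match against the keyword sets.
function classifyReply(body: string): ReplyKind {
  const text = body
    .trim()
    .toLowerCase()
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "");
  if (YES_WORDS.has(text)) return "yes";
  if (NO_WORDS.has(text)) return "no";
  if (NONE_WORDS.has(text)) return "none";
  return "other";
}
```

Matching is locale-independent in this sketch, so a German "ja" is accepted in an English session; whether the real matcher scopes keywords to the session locale is not stated here.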
Required environment variables
# .env
TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=your_twilio_auth_token
TWILIO_WHATSAPP_FROM=whatsapp:+14155551234 # your WhatsApp sender
These are read by lib/notifications/whatsapp.ts (outbound) and
the inbound webhook handler. All three are required in every
environment.
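Since all three variables are required, failing fast at startup is cheaper than discovering a missing token on the first inbound message. The loader below is a hypothetical sketch, not part of lib/notifications/whatsapp.ts; the `TwilioConfig` shape and the sender-prefix sanity check are assumptions.

```typescript
interface TwilioConfig {
  accountSid: string;
  authToken: string;
  whatsappFrom: string;
}

const REQUIRED_VARS = [
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "TWILIO_WHATSAPP_FROM",
] as const;

// Hypothetical loader: throws with the full list of missing names so one
// deploy attempt surfaces every gap.
function loadTwilioConfig(env: Record<string, string | undefined>): TwilioConfig {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  const whatsappFrom = env.TWILIO_WHATSAPP_FROM as string;
  // Sanity check (assumption): the sender uses the whatsapp:+<number> form.
  if (!whatsappFrom.startsWith("whatsapp:+")) {
    throw new Error("TWILIO_WHATSAPP_FROM must look like whatsapp:+<E.164 number>");
  }
  return {
    accountSid: env.TWILIO_ACCOUNT_SID as string,
    authToken: env.TWILIO_AUTH_TOKEN as string,
    whatsappFrom,
  };
}
```

In a Node process this would typically be called once with process.env during boot.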
Going live — provisioning checklist for ops
- Create a Twilio account and complete business verification.
- Request a WhatsApp Business sender in the Twilio console (24-72h approval).
- Submit message templates to Meta for approval. At minimum you need: greeting, consent prompt, capture instructions, retry prompt, retries-exhausted prompt, success confirmation. Each template needs a localized variant for every supported locale.
- Set the inbound webhook URL on the Twilio sender to https://<your-host>/api/v1/whatsapp/inbound.
- Populate TWILIO_* env vars in prod.
- Run a smoke test: send a freeform inbound from an approved tester number and confirm the session row is created.
Once a verification fires, everything downstream is the same as
the mobile SDK path — webhooks, dashboard review, and the
chargeback defense workflow all consume verify_ai_verifications
records identically.
What's next
- Verifying parked vehicles — the canonical end-to-end policy walkthrough.
- Chargeback defense — how a WhatsApp-captured photo flows into a retailer dispute.
- Webhooks — receive events for sessions that close, abandon, or pass review.