Capturing inspections over WhatsApp

In large parts of EMEA, LATAM, and India, and across the long tail of insurance and rental distribution, WhatsApp is the default support channel. End users will not download a fleet operator's mobile app to take an end-of-ride photo, and they won't follow a magic link to a browser scanner either. They will, however, reply to a message.

This guide walks through wiring VerifyAI behind a WhatsApp Business number so an operator can text a customer, get photos back, and run each one through processVerification() — exactly the same pipeline the mobile SDKs use.

Preview feature

The WhatsApp capture channel is in preview. You need a Twilio account, an approved WhatsApp Business sender, and Meta-approved message templates before it can be used in production. Sender approval takes 24-72 hours; template approval is a separate flow (see "Going live" below).

Architecture

plaintext
Customer phone
      │  (sends/receives WhatsApp messages)
      ▼
Twilio WhatsApp Business API
      │  (HTTP webhook, signed)
      ▼
POST /api/v1/whatsapp/inbound
      │  (verify X-Twilio-Signature, parse Body / MediaUrl0)
      ▼
WhatsApp state machine
  (lib/verify-ai/whatsapp-state-machine.ts)
      │  (advance state, download media via Twilio)
      ▼
processVerification()  ──►  same pipeline as the SDK
      │
      ▼
Reply WhatsApp message (template or freeform)

The state machine is server-side and stateful per phone number. A row in verify_ai_whatsapp_sessions (migration 20260610_whatsapp_sessions.sql) keeps the current state, captured media URLs, and the customer / policy context across the multi-message conversation.

The conversation state machine

Each session moves through these states:

| State | Trigger | Reply |
| ---------------- | ----------------------------------------- | -------------------------------------------------------------------- |
| greeting | First inbound from a new number | Localized greeting template, asks consent. |
| consent | User replies "yes" / "1" / equivalent | Sends instructions for the first photo. |
| instructions | After consent | Step-by-step "how to take a good photo" guidance. |
| capture_front | Media uploaded for front shot | Runs verification; on pass, prompts for back shot. |
| capture_back | Media uploaded for back shot | Runs verification; on pass, prompts for damage shot. |
| capture_damage | Media uploaded for damage shot | Final verification; transitions to review. |
| review | All photos captured | Transient: the webhook resolves the final outcome from the latest verification. |
| submitted | Review resolved | Sends compliant/non-compliant confirmation; session is closed and completed_verification_id is recorded. |
| abandoned | 24h idle, opt-out, or 3 retries in a step | Session is closed; further inbound from the same number starts a fresh session. |

Missing or unusable inbound (no image where one was expected) loops back to the same capture_* state with a retry message. The state machine allows up to 3 retries per step before moving to abandoned. A plain "no" / "stop" / "opt out" at any non-terminal state also moves the session to abandoned immediately.

The capture_damage step additionally accepts a "none" / "no damage" sentinel — if the customer texts that instead of a photo, the session transitions straight to review with no damage shot.
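The retry, opt-out, and damage-skip rules above can be sketched as a small pure function. This is an illustrative sketch only, not the contents of lib/verify-ai/whatsapp-state-machine.ts: the names advanceCapture, CaptureSession, and InboundMessage are hypothetical, the sketch treats any attached media as a passed verification, and it assumes that three retry prompts are sent before the next miss abandons the session.

```typescript
// Hypothetical sketch of the capture_* transitions; all names are illustrative.
type CaptureState = "capture_front" | "capture_back" | "capture_damage";
type State = CaptureState | "review" | "abandoned";

interface CaptureSession { state: State; retries: number }
interface InboundMessage { body: string; hasMedia: boolean }

const MAX_RETRIES = 3;
const OPT_OUT = new Set(["no", "stop", "opt out"]);
const NO_DAMAGE = new Set(["none", "no damage"]);

// Where each capture step goes once a usable photo arrives.
const NEXT: Record<CaptureState, State> = {
  capture_front: "capture_back",
  capture_back: "capture_damage",
  capture_damage: "review",
};

function isCaptureState(s: State): s is CaptureState {
  return s in NEXT;
}

function advanceCapture(session: CaptureSession, msg: InboundMessage): CaptureSession {
  const state = session.state;
  if (!isCaptureState(state)) return session; // review / abandoned are handled elsewhere

  const text = msg.body.trim().toLowerCase();

  // Damage step only: "none" / "no damage" skips the photo and goes to review.
  if (state === "capture_damage" && NO_DAMAGE.has(text)) {
    return { state: "review", retries: 0 };
  }
  // Opt-out abandons immediately.
  if (OPT_OUT.has(text)) {
    return { state: "abandoned", retries: session.retries };
  }
  // Assumption: attached media means the verification passed.
  if (msg.hasMedia) {
    return { state: NEXT[state], retries: 0 };
  }
  // No image where one was expected: retry in place, up to MAX_RETRIES.
  const retries = session.retries + 1;
  return retries > MAX_RETRIES ? { state: "abandoned", retries } : { state, retries };
}
```

Keeping the transition pure (session in, session out) makes the retry and opt-out rules easy to unit test without Twilio or the database in the loop.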

Security: verifying Twilio signatures

Every inbound POST from Twilio carries an X-Twilio-Signature header. The webhook handler at app/api/v1/whatsapp/inbound/route.ts recomputes the signature (HMAC-SHA1 over the full URL plus sorted form parameters, keyed by your Twilio auth token) and rejects mismatches with 401. Do not disable this check in production — without it, anyone can POST inbound payloads and trigger verifications on your dime.
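As a rough illustration of the check described above, the following sketch recomputes the signature with Node's built-in crypto module. The function names are hypothetical and this is not the code in route.ts. It assumes the form parameters have already been parsed into a flat string map, and that the URL passed in is exactly the one configured on the Twilio sender (scheme, host, path, and query string).

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: HMAC-SHA1 over the URL followed by each form
// parameter's name and value, with parameters sorted by name.
function computeTwilioSignature(
  authToken: string,
  url: string,
  params: Record<string, string>,
): string {
  const data = Object.keys(params)
    .sort()
    .reduce((acc, key) => acc + key + (params[key] ?? ""), url);
  return createHmac("sha1", authToken).update(data, "utf8").digest("base64");
}

// Hypothetical helper: compare in constant time; a missing header is a mismatch.
function isValidTwilioSignature(
  authToken: string,
  url: string,
  params: Record<string, string>,
  headerValue: string | null,
): boolean {
  if (!headerValue) return false;
  const expected = Buffer.from(computeTwilioSignature(authToken, url, params));
  const received = Buffer.from(headerValue);
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

A handler would respond with 401 whenever isValidTwilioSignature returns false. Twilio's Node helper library also exposes a validateRequest function that performs the same check, which may be preferable to maintaining a hand-rolled version.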

Handling media

Twilio doesn't push the image bytes; it pushes a MediaUrl0 URL that must be fetched with HTTP Basic auth (TWILIO_ACCOUNT_SID as username, TWILIO_AUTH_TOKEN as password). The handler:

  1. Downloads the bytes with downloadTwilioMedia().
  2. Generates a verification ID and uploads to the verify-ai-images bucket via uploadVerificationImage(). The path follows the standard <customerId>/<year>/<month>/<verificationId>.<ext> shape.
  3. Calls processVerification() with the bytes (base64), the policy bound to the session row, and tagged metadata.
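The first two steps above can be approximated as follows. This is a sketch under stated assumptions, not the actual downloadTwilioMedia() or uploadVerificationImage() implementations: the helper names are hypothetical, and the zero-padded UTC month in the storage path is a guess at how the <month> segment is formatted.

```typescript
// Hypothetical helper: Basic auth header from the Twilio account SID and auth token.
function twilioBasicAuthHeader(accountSid: string, authToken: string): string {
  return "Basic " + Buffer.from(`${accountSid}:${authToken}`).toString("base64");
}

// Hypothetical helper: <customerId>/<year>/<month>/<verificationId>.<ext>
// Assumption: UTC dates and a zero-padded month.
function buildImagePath(
  customerId: string,
  verificationId: string,
  ext: string,
  now: Date = new Date(),
): string {
  const year = now.getUTCFullYear();
  const month = String(now.getUTCMonth() + 1).padStart(2, "0");
  return `${customerId}/${year}/${month}/${verificationId}.${ext}`;
}

// Hypothetical helper: fetch the media bytes that MediaUrl0 points at.
async function fetchTwilioMedia(
  mediaUrl: string,
  accountSid: string,
  authToken: string,
): Promise<{ bytes: Uint8Array; contentType: string }> {
  const res = await fetch(mediaUrl, {
    headers: { Authorization: twilioBasicAuthHeader(accountSid, authToken) },
  });
  if (!res.ok) {
    throw new Error(`Twilio media download failed with status ${res.status}`);
  }
  return {
    bytes: new Uint8Array(await res.arrayBuffer()),
    contentType: res.headers.get("content-type") ?? "application/octet-stream",
  };
}
```

The downloaded bytes can then be base64-encoded (for example with Buffer.from(bytes).toString("base64")) before being handed to processVerification().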

The verification result is stored in verify_ai_verifications like any other capture, with the following metadata pinned for traceability (capture_step is one of "front", "back", or "damage"):

json
{
  "channel": "whatsapp",
  "whatsapp_session_id": "<uuid>",
  "capture_step": "front"
}

Idle timeout

A session that has been idle for more than 24 hours since last_message_at is treated as abandoned. The idle check is enforced in two places:

  • The webhook query that finds an active session for an inbound message excludes terminal states, and the state machine's isIdleTimedOut() is re-checked on entry as a safety net.
  • Any new inbound from the same phone after timeout starts a fresh session at the webhook layer.

There is no separate scheduled sweep today — abandonment is recognized lazily on the next inbound. Don't rely on the customer to gracefully end the conversation; they won't.
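A minimal sketch of the lazy check might look like this. The function name isIdleTimedOut() comes from the text above, but its signature here is an assumption, and findActiveSession and SessionRow are hypothetical stand-ins for the webhook's session lookup.

```typescript
const IDLE_TIMEOUT_MS = 24 * 60 * 60 * 1000; // 24 hours

// Assumed signature: true once more than 24 hours have passed since the last message.
function isIdleTimedOut(lastMessageAt: Date, now: Date = new Date()): boolean {
  return now.getTime() - lastMessageAt.getTime() > IDLE_TIMEOUT_MS;
}

// Hypothetical row shape; only the columns this sketch needs.
interface SessionRow { id: string; state: string; last_message_at: string }

const TERMINAL_STATES = new Set(["submitted", "abandoned"]);

// Hypothetical lookup: skip terminal sessions and re-check idleness as a safety net.
// Returning undefined signals the caller to start a fresh session.
function findActiveSession(rows: SessionRow[], now: Date = new Date()): SessionRow | undefined {
  return rows.find(
    (row) => !TERMINAL_STATES.has(row.state) && !isIdleTimedOut(new Date(row.last_message_at), now),
  );
}
```

Because nothing sweeps idle rows in the background, a timed-out session keeps its last non-terminal state in the table until the next inbound arrives, so reporting queries should apply the same 24-hour rule rather than trusting the state column alone.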

Internationalization

The state machine reads its outbound strings from lib/verify-ai/i18n via the t(locale, key) helper. Four locales are seeded today:

| Locale | Status |
| ------ | -------------------------------------- |
| en | Available: en.json. |
| es | Available: es.json. |
| fr | Available: fr.json. |
| de | Available: de.json. |
| pt-BR | Roadmap. |
| hi | Roadmap. |

Locale is resolved at session-create time and stored on context.locale. Unsupported codes fall back to en via resolveLocale(). Both yes/no keyword matching and the damage-skip "none" sentinel recognize multi-language variants (yes, si, oui, ja; no, nein, non; none, ninguno, aucun, keine).
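The fallback and keyword matching described above could look roughly like the sketch below. resolveLocale() is named in the text, but this body is a guess at its behavior, and classifyReply is a hypothetical helper. Two details are assumptions rather than documented behavior: region suffixes are stripped before matching (so "es-MX" resolves to es), and accents are removed before keyword lookup (so "sí" matches si).

```typescript
type Locale = "en" | "es" | "fr" | "de";
const SUPPORTED_LOCALES: readonly Locale[] = ["en", "es", "fr", "de"];

// Assumed behavior: match on the base language code, fall back to "en".
function resolveLocale(code: string | undefined): Locale {
  const base = (code ?? "").trim().toLowerCase().split(/[-_]/)[0] ?? "";
  return SUPPORTED_LOCALES.find((locale) => locale === base) ?? "en";
}

// "1" is included because the consent step accepts it as a yes.
const YES_WORDS = new Set(["yes", "si", "oui", "ja", "1"]);
const NO_WORDS = new Set(["no", "nein", "non"]);
const NONE_WORDS = new Set(["none", "ninguno", "aucun", "keine"]);

type ReplyIntent = "yes" | "no" | "none" | "unknown";

// Hypothetical helper: normalize case, whitespace, and accents, then look up the keyword.
function classifyReply(body: string): ReplyIntent {
  const text = body
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "") // strip combining accents
    .trim()
    .toLowerCase();
  if (YES_WORDS.has(text)) return "yes";
  if (NO_WORDS.has(text)) return "no";
  if (NONE_WORDS.has(text)) return "none";
  return "unknown";
}
```

Matching keywords across all languages at once, rather than per locale, means a customer who answers in a different language than the greeting is still understood.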

Required environment variables

bash
# .env
TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=your_twilio_auth_token
TWILIO_WHATSAPP_FROM=whatsapp:+14155551234   # your WhatsApp sender

These are read by lib/notifications/whatsapp.ts (outbound) and the inbound webhook handler. All three are required in every environment.
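Since all three variables are mandatory, it can help to fail fast at startup instead of at the first inbound message. The loader below is a hypothetical sketch (loadTwilioConfig and TwilioConfig are not names from the codebase), and the check for a whatsapp:+ prefix is an assumption based on the example value above.

```typescript
interface TwilioConfig {
  accountSid: string;
  authToken: string;
  whatsappFrom: string;
}

// Hypothetical loader: throws one error listing every missing variable.
function loadTwilioConfig(env: Record<string, string | undefined>): TwilioConfig {
  const required = ["TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN", "TWILIO_WHATSAPP_FROM"] as const;
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  const whatsappFrom = env.TWILIO_WHATSAPP_FROM ?? "";
  // Assumption: senders are written as "whatsapp:+<E.164 number>".
  if (!whatsappFrom.startsWith("whatsapp:+")) {
    throw new Error("TWILIO_WHATSAPP_FROM must look like whatsapp:+14155551234");
  }

  return {
    accountSid: env.TWILIO_ACCOUNT_SID ?? "",
    authToken: env.TWILIO_AUTH_TOKEN ?? "",
    whatsappFrom,
  };
}

// Typical usage at process start: const config = loadTwilioConfig(process.env);
```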

Going live — provisioning checklist for ops

  1. Create a Twilio account and complete business verification.
  2. Request a WhatsApp Business sender in the Twilio console (24-72h approval).
  3. Submit message templates to Meta for approval. At minimum you need: greeting, consent prompt, capture instructions, retry prompt, exhausted prompt, success confirmation. Each one needs localized variants per supported locale.
  4. Set the inbound webhook URL on the Twilio sender to https://<your-host>/api/v1/whatsapp/inbound.
  5. Populate TWILIO_* env vars in prod.
  6. Run a smoke test: send a freeform inbound from an approved tester number and confirm the session row is created.

Cross-references

Once a verification fires, everything downstream is the same as the mobile SDK path — webhooks, dashboard review, and the chargeback defense workflow all consume verify_ai_verifications records identically.
