Training custom models on your own data
Most customers should not train their own model. Our shared bundles already cover parked-vehicle, damage, and forest-bay policies, and you benefit from the labelled data every other partner contributes back. Start there.
But sometimes you need a customer-specific detector — a private asset class, a competitor lockup, a one-off forensic check. The same pipeline that trains our shared bundles can train yours, and the resulting bundle dispatches and drifts just like any other.
Pipeline at a glance
raw photos
    │
    ▼
verify_ai_ml_datasets ──┐
    │                   │
    ▼                   │
verify_ai_ml_samples    │   (COCO-style JSONB annotations,
    │                   │    split = train / validation / test)
    ▼                   │
ml/ Python training  ◄──┘
    (RT-DETRv2-S detector,
     multi-format export)
    │
    ▼
verify_ai_ml_models ────► verify_ai_model_bundles
                              │
                              ▼
                          ModelManager (SDK)
                          auto-download
Everything except the SDK download happens server-side.
1. Build the dataset
A dataset is a row in verify_ai_ml_datasets plus N rows in
verify_ai_ml_samples. Each sample stores its image storage path,
the compliance ground truth, and a COCO-style annotation blob:
insert into verify_ai_ml_datasets (policy_id, name, status)
values ('pol_acme_parking', 'Acme parking v1', 'collecting')
returning id;
-- → '7b1c…' (UUID)
insert into verify_ai_ml_samples
(dataset_id, image_path, is_compliant, annotations, split, review_status)
values
('7b1c…', 'datasets/acme/0001.jpg',
true,
'{"objects":[{"class":"acme_scooter","bbox":[12,8,640,920]}]}'::jsonb,
'train', 'approved');
verify_ai_ml_datasets.status accepts 'collecting' | 'ready' | 'training' | 'archived'. Flip from collecting to ready once
labelling is done. Split values match what the training loop expects:
| Split | Used for |
| ------------ | ---------------------------------------------------------- |
| train | Gradient updates. |
| validation | Early stopping, hyperparameter tuning. |
| test | Held-out final metrics reported on the model row. |
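It can help to check sample rows on the client before inserting them. The following is a minimal sketch of such a pre-flight check, not part of the pipeline: `validate_sample` and its messages are hypothetical names, and it enforces only what this page states (the three split values, a boolean `is_compliant`, and an annotations object with an `objects` list). It checks that each `bbox` has four numbers but makes no assumption about the coordinate convention.

```python
ALLOWED_SPLITS = {"train", "validation", "test"}


def validate_sample(sample: dict) -> list[str]:
    """Return the problems found in one sample row (empty list means OK).

    Hypothetical helper: field names mirror the insert statement above.
    """
    problems = []
    if sample.get("split") not in ALLOWED_SPLITS:
        problems.append(f"unknown split: {sample.get('split')!r}")
    if not isinstance(sample.get("is_compliant"), bool):
        problems.append("is_compliant must be a boolean")
    if not sample.get("image_path"):
        problems.append("image_path is empty")

    objects = (sample.get("annotations") or {}).get("objects")
    if not isinstance(objects, list):
        problems.append("annotations.objects must be a list")
        return problems

    for i, obj in enumerate(objects):
        if not isinstance(obj, dict):
            problems.append(f"objects[{i}] must be an object")
            continue
        if not obj.get("class"):
            problems.append(f"objects[{i}].class is missing")
        bbox = obj.get("bbox")
        # Shape check only; the coordinate convention is not assumed here.
        if not (isinstance(bbox, list) and len(bbox) == 4
                and all(isinstance(v, (int, float)) for v in bbox)):
            problems.append(f"objects[{i}].bbox must be four numbers")
    return problems
```

Running this over a batch before the insert surfaces typos such as `split = 'val'` early, instead of at training time.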
Synthetic / augmented samples don't get their own split. They live in
the same dataset with is_augmented = true and augmentation_type
populated; the training loop filters them out of the test slice. The
generator under ml/data/ (a diffusion-based augmenter) is in preview
and not wired into CI yet.
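The filtering rule for augmented samples can be illustrated in a few lines. This is a sketch under assumptions, not the training loop's actual code: `select_split` is a hypothetical name, and samples are plain dicts carrying the `split` and `is_augmented` columns described above.

```python
def select_split(samples: list[dict], split: str) -> list[dict]:
    """Return the samples for one split, keeping augmented rows out of test.

    Hypothetical helper illustrating the rule; augmented rows stay available
    for the train and validation splits.
    """
    selected = [s for s in samples if s["split"] == split]
    if split == "test":
        selected = [s for s in selected if not s.get("is_augmented", False)]
    return selected


samples = [
    {"image_path": "0001.jpg", "split": "train", "is_augmented": False},
    {"image_path": "0001_aug.jpg", "split": "train", "is_augmented": True},
    {"image_path": "0002.jpg", "split": "test", "is_augmented": False},
    {"image_path": "0002_aug.jpg", "split": "test", "is_augmented": True},
]
test_paths = [s["image_path"] for s in select_split(samples, "test")]
```

Keeping augmented rows out of the test slice means the held-out metrics are measured on real photos only.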
2. Train
Training runs in ml/ (Python). RT-DETRv2-S is the default detector;
the same Makefile downloads images, converts to COCO, trains,
exports to CoreML / TFLite / ONNX, and runs the gold-eval comparison:
cd ml
make all DATASET_ID=7b1c… POLICY_ID=pol_acme_parking
# Or run the stages individually:
# make download convert train export validate eval
The training step (make train) shells into
python training/train_rtdetr.py with training/config.yaml. Each
async job writes a row in verify_ai_ml_jobs (status, progress,
logs, metrics) and on success creates a verify_ai_ml_models row
with:
- base_architecture (rt-detrv2-s).
- dataset_id → the training dataset.
- coreml_path / tflite_path / onnx_path, plus matching sha256_coreml / sha256_tflite / sha256_onnx.
- ontology_version / schema_version: the contract this model was trained against.
- Metrics: accuracy, precision_score, recall, f1_score, map50, map50_95, gemini_agreement, inference_time_ms.
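For readers comparing model rows, it may help to recall how three of these metrics relate. The sketch below assumes the conventional definitions of precision, recall, and F1 computed from true-positive, false-positive, and false-negative counts. How the pipeline actually counts matches (per box or per image, and at which IoU threshold) is not described here, so `detection_scores` is a hypothetical illustration only.

```python
def detection_scores(true_pos: int, false_pos: int, false_neg: int) -> dict[str, float]:
    """Conventional precision, recall, and F1 from raw counts.

    Assumption: standard definitions; the pipeline's own computation
    is not shown in this guide.
    """
    predicted = true_pos + false_pos   # everything the model flagged
    actual = true_pos + false_neg      # everything it should have flagged
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision_score": precision, "recall": recall, "f1_score": f1}


scores = detection_scores(true_pos=3, false_pos=1, false_neg=3)
```

With these counts precision is 3/4 and recall is 3/6, so F1 sits between the two and closer to the lower value.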
3. Bundle
A model on its own can't be shipped — devices need the policy AST and
the ontology version too. That packaging is what
verify_ai_model_bundles is for. In practice make deploy runs
ml/deploy/register_bundle.py, which uploads each exported artifact
to the verify-ai-models storage bucket, computes per-artifact SHA-256s,
loads the policy AST from ml/config/policy_asts/<policy_id>.json,
and inserts a row equivalent to:
insert into verify_ai_model_bundles
(policy_id, bundle_version, ontology_version, schema_version,
policy_ast, artifacts, is_active,
training_dataset_id, git_sha, gold_eval_results)
values (
'pol_acme_parking',
(select coalesce(max(bundle_version), 0) + 1
from verify_ai_model_bundles
where policy_id = 'pol_acme_parking'),
'v3',
'1.0.0',
'{ /* policy AST loaded from ml/config/policy_asts/pol_acme_parking.json */ }'::jsonb,
jsonb_build_array(
jsonb_build_object(
'role', 'detector',
'architecture', 'rt-detrv2-s',
'version', 1,
'format', 'coreml',
'storage_path', 'pol_acme_parking/v1/model.mlpackage.zip',
'size_bytes', 18204113,
'sha256', 'a1b2…'
),
jsonb_build_object(
'role', 'detector',
'architecture', 'rt-detrv2-s',
'version', 1,
'format', 'tflite',
'storage_path', 'pol_acme_parking/v1/model.tflite',
'size_bytes', 16801004,
'sha256', 'c3d4…'
)
),
false, -- not active yet, see promotion gate below
'7b1c…'::uuid,
'<training repo commit>',
'{ /* mAP / per-class precision JSON */ }'::jsonb
);
policy_ast lives on the bundle row itself; it is not read from
verify_ai_policies (which only carries the VLM-side config: provider,
model, system prompt, etc.). is_active = false keeps the bundle out
of /api/v1/models/latest results until we explicitly promote it.
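To make the shape of one `artifacts` element concrete, here is a minimal sketch that builds it from an exported file's bytes. It is not the code in ml/deploy/register_bundle.py, whose internals are not shown here: `artifact_entry` is a hypothetical helper, and the default `role`, `architecture`, and `version` values are simply copied from the example row above.

```python
import hashlib


def artifact_entry(data: bytes, *, fmt: str, storage_path: str,
                   role: str = "detector",
                   architecture: str = "rt-detrv2-s",
                   version: int = 1) -> dict:
    """Describe one exported artifact in the shape of an artifacts[] element.

    Hypothetical helper: size_bytes and sha256 are derived from the bytes,
    the remaining fields are passed through unchanged.
    """
    return {
        "role": role,
        "architecture": architecture,
        "version": version,
        "format": fmt,
        "storage_path": storage_path,
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


# Placeholder bytes stand in for a real exported model file.
entry = artifact_entry(b"abc", fmt="tflite",
                       storage_path="pol_acme_parking/v1/model.tflite")
```

Deriving `size_bytes` and `sha256` from the same bytes that get uploaded keeps the two fields consistent with what ends up in the storage bucket.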
4. Promotion gate
The convention before flipping is_active = true is to capture a
reference snapshot for this bundle into
verify_ai_model_reference_stats — that's what drift detection
compares against later. See
Detecting model drift for the snapshot
shape.
Promotion script (paraphrased):
await captureReferenceSnapshot(bundleId); // inserts a row with
// snapshot_type='promotion'
await supabase
.from("verify_ai_model_bundles")
.update({ is_active: true })
.eq("id", bundleId);
The gate is convention, not a DB constraint. If you skip the snapshot
the bundle still goes live, but the hourly drift cron will record live
stats with drift_score = null for it — you get no drift signal until
a reference is captured.
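Because nothing in the database enforces the gate, a team may want its own guard in tooling. The sketch below shows one way to express the convention over in-memory rows. It is an illustration under assumptions: `promote` and `MissingReferenceSnapshot` are hypothetical names, and the `bundle_id` column on the reference-stats rows is assumed rather than taken from the schema.

```python
class MissingReferenceSnapshot(Exception):
    """Raised when a bundle has no promotion-time reference snapshot."""


def has_promotion_snapshot(bundle_id: str, reference_stats: list[dict]) -> bool:
    """True if a snapshot_type='promotion' row exists for this bundle.

    Assumption: reference-stats rows carry a bundle_id column.
    """
    return any(
        row.get("bundle_id") == bundle_id
        and row.get("snapshot_type") == "promotion"
        for row in reference_stats
    )


def promote(bundle: dict, reference_stats: list[dict]) -> dict:
    """Return a copy of the bundle with is_active=True, or refuse."""
    if not has_promotion_snapshot(bundle["id"], reference_stats):
        raise MissingReferenceSnapshot(
            f"bundle {bundle['id']} has no promotion snapshot"
        )
    return {**bundle, "is_active": True}
```

Failing loudly in tooling avoids the silent outcome described above, where the bundle goes live but drift scores stay null.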
5. Dispatch
Once the bundle is active, devices pick it up on their next
ModelManager.checkForUpdates() call. To roll it out gradually,
combine with dispatch targeting:
update verify_ai_model_bundles
set bundle_tier = 'canary',
weight = 10,
targeting = '{"customer_ids":["cus_acme"]}'::jsonb
where id = '...';
That keeps the new bundle to 10% of Acme devices for a few days while
the drift PSI stabilises, then a follow-up update promotes it to
production at weight = 100.
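How the dispatcher turns `weight` into a device-level decision is not described on this page. One common approach for weighted rollouts is deterministic hash bucketing, sketched below as an assumption rather than as the server's implementation; `in_canary` and its arguments are hypothetical.

```python
import hashlib


def in_canary(device_id: str, bundle_id: str, weight: int) -> bool:
    """Deterministically place a device inside or outside a weighted rollout.

    Assumed technique (hash bucketing), not the actual dispatcher logic:
    hash the bundle and device IDs into one of 100 buckets and admit the
    device if its bucket index is below the weight.
    """
    digest = hashlib.sha256(f"{bundle_id}:{device_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < weight


# Roughly `weight` percent of devices should land in the canary.
devices = [f"device-{i}" for i in range(10_000)]
canary_share = sum(in_canary(d, "bundle-42", 10) for d in devices) / len(devices)
```

Hashing keeps each device's assignment stable between update checks, and including the bundle ID in the hash means a different bundle samples a different slice of devices.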
Lineage and provenance
Every bundle carries enough metadata to reproduce or audit it (the
lineage columns were added to verify_ai_model_bundles in
20260620_model_dispatch_targeting.sql):
- verify_ai_model_bundles.git_sha: the training repo commit.
- verify_ai_model_bundles.training_dataset_id: FK into verify_ai_ml_datasets, so the exact sample rows are reachable.
- verify_ai_model_bundles.gold_eval_results: held-out metrics jsonb.
- verify_ai_model_bundles.artifacts[].sha256: per-artifact integrity.
- verify_ai_ml_models keeps the matching per-format metrics (map50, map50_95, inference_time_ms, etc.) and storage paths.
Auditors get a single chain from "this device decision" back to the dataset rows that trained the model. Don't break that chain by hand-editing rows; use the migrations.
What's next
- Detecting model drift — what happens after promotion.
- Model dispatch and targeting — how to canary your new bundle to a slice of users first.