A clinician-facing chronic-care platform with patient intake, structured encounter notes, and an AI scribe that earns clinician trust.

CADUCEUS needed to ship a working clinician console — intake, scheduling, encounter notes, an AI-assisted scribe, and an audit trail — for a pilot with two clinics. Two cycles, sequenced. The scribe was the thing they were afraid would not work.

stack → Next.js 15 · Postgres + pgvector · Twilio (telehealth) · Whisper + Claude · Sentry · Cloudflare Workers · HIPAA-aware infra

SCHEMATIC An abstract view of the CADUCEUS engagement — not a literal product screenshot. Built to communicate engineering shape, not surface design.

▷ outcomes

94%

Scribe accuracy (cardio + endocrine)

−63%

Time-per-encounter note

PHI incidents in pilot window

T+1,402h

Pilot launch (2 cycles · 38h spare)

[ §01 ] the cycle

How 720 hours actually ran.

Cycle 1 · Day 01 — 30

Console + intake + encounter notes

First cycle: clinician console, patient intake flow, scheduling, structured encounter notes with a hard-coded template per condition. No AI in the loop yet — we wanted the human workflow correct before agents touched it.

↳clinician console ↳intake flow ↳encounter template v1

Cycle 2 · Day 01 — 18

AI scribe + eval harness

build.agent shipped the scribe — Whisper transcription, Claude-based note structuring, clinician confirmation step. Crucially, qa.agent built an eval harness that ran every PR against 47 anonymized historical transcripts. Below threshold, the PR didn't merge.

↳scribe v1 ↳eval harness ↳47 anonymized transcripts

Cycle 2 · Day 19 — 26

Privacy posture + audit trail

Scoped IAM, encrypted-at-rest with customer-managed keys, signed audit log, ephemeral inference (no transcript retention beyond clinician confirmation). Their compliance counsel reviewed the architecture mid-cycle.

↳IAM scoped ↳CMK keys ↳audit log

Cycle 2 · Day 27 — 30

Pilot rollout

Two clinics, six clinicians, soft launch behind a feature flag. Daily check-ins with the founding clinician for the first 14 days post-launch. Scribe accuracy threshold met from day three.

↳pilot live ↳clinician training ↳30d on-call

[ §02 ] agent log · selected

What the loop looked like.

verbatim · 6 lines · timestamps redacted

cycle-log · caduceus

archived

T+120h [INFO] scope.agent encounter template per chronic condition · 7 conditions in pilot

T+240h [ >> ] build.agent scribe v1 shipped behind flag · clinician-confirmed structure

T+360h [WARN] qa.agent scribe accuracy below threshold on cardiology cohort · paged human

T+480h [ OK ] qa.agent added cardiology-specific prompt · accuracy back above threshold

T+600h [ OK ] monitor.agent phi access trail · 100% audit coverage

T+720h [ OK ] deploy.agent pilot live in 2 clinics · clinician latency p95 90ms

[ §03 ] notes from the cycle

CADUCEUS sits in the category of healthtech that fails ninety percent of the time: clinician-facing AI in a regulated workflow. The technical risk isn’t the model. The risk is shipping something a clinician will turn off after the second use because it gets their language wrong.

How we sized the engagement

Two cycles, sequenced — not concurrent. The first cycle shipped the human-only workflow. The second cycle layered the scribe on top. This sequencing was non-negotiable: we don’t build AI into a workflow that doesn’t yet exist as a human process.

What the eval harness actually looks like

qa.agent maintains a corpus of 47 anonymized historical transcripts (provided with explicit patient consent, redacted of identifiers, hashed). Every PR that touches the scribe is run against the corpus and scored on three axes — structural compliance with the encounter template, clinical-vocabulary fidelity, and omission rate of clinically significant information. Below threshold on any axis, the PR does not merge.

The corpus is the product. We treat it that way.
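A minimal sketch of how a three-axis merge gate like this could be wired, for readers who want the shape in code. Everything named here is an assumption for illustration: the `TranscriptScore` type, the `gate` function, the threshold values, and the choice to average each axis across the corpus are not taken from the CADUCEUS harness, whose real thresholds and aggregation are not published in this write-up.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical thresholds; the real values are not stated in the case study.
THRESHOLDS = {
    "structure": 0.95,   # minimum structural compliance with the encounter template
    "vocabulary": 0.90,  # minimum clinical-vocabulary fidelity
    "omission": 0.05,    # maximum omission rate (lower is better)
}


@dataclass(frozen=True)
class TranscriptScore:
    """Scores for one anonymized transcript, each in the range 0..1."""
    transcript_id: str
    structure: float
    vocabulary: float
    omission: float


def gate(scores: list[TranscriptScore]) -> tuple[bool, dict[str, float]]:
    """Return (may_merge, per-axis corpus averages).

    Assumption: each axis is averaged over the corpus, and the PR is
    blocked if any single axis misses its threshold.
    """
    if not scores:
        raise ValueError("empty corpus: refusing to pass a PR with no evidence")
    averages = {
        "structure": mean(s.structure for s in scores),
        "vocabulary": mean(s.vocabulary for s in scores),
        "omission": mean(s.omission for s in scores),
    }
    may_merge = (
        averages["structure"] >= THRESHOLDS["structure"]
        and averages["vocabulary"] >= THRESHOLDS["vocabulary"]
        and averages["omission"] <= THRESHOLDS["omission"]
    )
    return may_merge, averages


if __name__ == "__main__":
    corpus_scores = [
        TranscriptScore("t1", structure=0.98, vocabulary=0.95, omission=0.02),
        TranscriptScore("t2", structure=0.96, vocabulary=0.93, omission=0.04),
    ]
    may_merge, averages = gate(corpus_scores)
    print("merge allowed" if may_merge else "merge blocked", averages)
```

Averaging is the simplest aggregation, but it can hide a weak cohort behind a strong one. A stricter variant would run the same gate per condition, so that a drop confined to one specialty still blocks the merge.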

What HIPAA-aware actually means in our delivery

We’re not a HIPAA auditor and we don’t claim to be. What we do is bring a posture: scoped IAM by default, customer-managed encryption keys, signed audit trails, ephemeral inference, business-associate-agreement-ready architecture. We partnered with CADUCEUS’s compliance counsel from week two. The certifications belong to the client; the readiness is what we deliver.
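To make "signed audit trails" concrete, here is one common way to build a tamper-evident log: each entry carries an HMAC over its own content plus the previous entry's signature, so editing or deleting any record breaks the chain. This is a sketch under stated assumptions, not a description of the CADUCEUS implementation. The function names, the entry layout, and the use of HMAC-SHA256 are illustrative choices, and in a real deployment the signing key would come from a key-management service rather than application code.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous signature" for the first entry


def sign_entry(key: bytes, prev_sig: str, event: dict) -> dict:
    """Build a log entry whose signature covers the event and the prior signature."""
    # Canonical JSON so the same event always serializes to the same bytes.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    sig = hmac.new(key, (prev_sig + payload).encode("utf-8"), hashlib.sha256).hexdigest()
    return {"event": event, "prev": prev_sig, "sig": sig}


def append(log: list[dict], key: bytes, event: dict) -> None:
    """Append an event, chaining it to the last entry's signature."""
    prev = log[-1]["sig"] if log else GENESIS
    log.append(sign_entry(key, prev, event))


def verify(log: list[dict], key: bytes) -> bool:
    """Recompute every signature in order; any edit or removal fails the check."""
    prev = GENESIS
    for entry in log:
        expected = sign_entry(key, prev, entry["event"])["sig"]
        if entry["prev"] != prev or not hmac.compare_digest(entry["sig"], expected):
            return False
        prev = entry["sig"]
    return True


if __name__ == "__main__":
    key = b"demo-key-only"  # hypothetical; never hard-code a real key
    log: list[dict] = []
    append(log, key, {"actor": "clinician-01", "action": "read", "record": "enc-123"})
    append(log, key, {"actor": "clinician-02", "action": "confirm", "record": "enc-123"})
    print("chain valid:", verify(log, key))
```

The chain only proves that the stored log has not been altered after the fact. Showing that every PHI access actually produced an entry is a separate coverage question.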

from the founder

"I've been a part of three health-tech builds before this one. The first time the engineering team understood that the scribe is the clinical relationship — not just an LLM call."

— Founding clinician · CADUCEUS

