kaedax

A clinician-facing chronic-care platform with patient intake, structured encounter notes, and an AI scribe that earns clinician trust.

CADUCEUS needed to ship a working clinician console — intake, scheduling, encounter notes, an AI-assisted scribe, and an audit trail — for a pilot with two clinics. Two cycles, sequenced. The scribe was the thing they were afraid would not work.

stack → Next.js 15 · Postgres + pgvector · Twilio (telehealth) · Whisper + Claude · Sentry · Cloudflare Workers · HIPAA-aware infra
[ schematic panels: AUDIO 12:34 (cardiology) · TRANSCRIPTION · ENCOUNTER · CARDIOLOGY v3 (chief complaint, meds, vitals, assessment, plan) · CLINICIAN CONFIRM · AUDIT LOG (phi-scrubbed) · 94% SCRIBE ACCURACY, CARDIO + ENDOCRINE COHORT ]
SCHEMATIC An abstract view of the CADUCEUS engagement — not a literal product screenshot. Built to communicate engineering shape, not surface design.

outcomes

01 · 94% · Scribe accuracy (cardio + endocrine)

02 · −63% · Time-per-encounter note

03 · 0 · PHI incidents in pilot window

04 · T+1,402h · Pilot launch (2 cycles · 38h spare)

[ §01 ] the cycle

How 720 hours actually ran.

  1. Cycle 1 · Day 01 — 30

    Console + intake + encounter notes

    First cycle: clinician console, patient intake flow, scheduling, structured encounter notes with a hard-coded template per condition. No AI in the loop yet — we wanted the human workflow correct before agents touched it.

    clinician console · intake flow · encounter template v1
  2. Cycle 2 · Day 01 — 18

    AI scribe + eval harness

    build.agent shipped the scribe: Whisper transcription, Claude-based note structuring, and a clinician confirmation step. Crucially, qa.agent built an eval harness that ran every PR touching the scribe against 47 anonymized historical transcripts. If a PR scored below threshold, it didn't merge.

    scribe v1 · eval harness · 47 anonymized transcripts
  3. Cycle 2 · Day 19 — 26

    Privacy posture + audit trail

    Scoped IAM, encryption at rest with customer-managed keys, a signed audit log, and ephemeral inference (no transcript retention beyond clinician confirmation). Their compliance counsel reviewed the architecture mid-cycle.

    IAM scoped · CMK keys · audit log
  4. Cycle 2 · Day 27 — 30

    Pilot rollout

    Two clinics, six clinicians, soft launch behind a feature flag. Daily check-ins with the founding clinician for the first 14 days post-launch. Scribe accuracy threshold met from day three.

    pilot live · clinician training · 30d on-call
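
The scribe flow from the second cycle (transcribe, structure, clinician confirmation, then no transcript retention) can be sketched roughly as follows. This is a minimal illustration, not the CADUCEUS code: `EncounterDraft`, `draft_note`, and the injected `transcribe` and `structure` callables are hypothetical names, and the five section names are borrowed from the schematic rather than from a real template.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set

# Assumed section names, taken from the schematic; real templates vary per condition.
SECTIONS = ("chief_complaint", "meds", "vitals", "assessment", "plan")


@dataclass
class EncounterDraft:
    transcript: Optional[str]          # held only until the note is signed
    sections: Dict[str, str]           # structured note, one entry per section
    confirmed: Set[str] = field(default_factory=set)
    signed_by: Optional[str] = None

    def confirm(self, section: str) -> None:
        """Clinician ticks off one section after reading it."""
        if section not in self.sections:
            raise KeyError(section)
        self.confirmed.add(section)

    def sign(self, clinician: str) -> None:
        """Refuse to sign until every section is confirmed, then drop the transcript."""
        missing = [s for s in SECTIONS if s not in self.confirmed]
        if missing:
            raise ValueError(f"unconfirmed sections: {missing}")
        self.signed_by = clinician
        self.transcript = None  # ephemeral: nothing retained past confirmation


def draft_note(
    audio: bytes,
    transcribe: Callable[[bytes], str],            # stand-in for the speech-to-text step
    structure: Callable[[str], Dict[str, str]],    # stand-in for the LLM structuring step
) -> EncounterDraft:
    transcript = transcribe(audio)
    sections = structure(transcript)
    missing = [s for s in SECTIONS if s not in sections]
    if missing:
        raise ValueError(f"structuring step omitted sections: {missing}")
    return EncounterDraft(transcript=transcript, sections=dict(sections))
```

Passing the transcription and structuring steps in as plain callables keeps the confirmation logic testable without any model in the loop, which mirrors the human-first sequencing described above.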

[ §02 ] agent log · selected

What the loop looked like.

cycle-log · caduceus
archived
T+120h [INFO] scope.agent encounter template per chronic condition · 7 conditions in pilot
T+240h [ >> ] build.agent scribe v1 shipped behind flag · clinician-confirmed structure
T+360h [WARN] qa.agent scribe accuracy below threshold on cardiology cohort · paged human
T+480h [ OK ] qa.agent added cardiology-specific prompt · accuracy back above threshold
T+600h [ OK ] monitor.agent phi access trail · 100% audit coverage
T+720h [ OK ] deploy.agent pilot live in 2 clinics · clinician latency p95 90ms

[ §03 ] notes from the cycle

CADUCEUS sits in the category of healthtech that fails ninety percent of the time: clinician-facing AI in a regulated workflow. The technical risk isn’t the model. The risk is shipping something a clinician will turn off after the second use because it gets their language wrong.

How we sized the engagement

Two cycles, sequenced — not concurrent. The first cycle shipped the human-only workflow. The second cycle layered the scribe on top. This sequencing was non-negotiable: we don’t build AI into a workflow that doesn’t yet exist as a human process.

What the eval harness actually looks like

qa.agent maintains a corpus of 47 anonymized historical transcripts (provided with explicit patient consent, redacted of identifiers, hashed). Every PR that touches the scribe is run against the corpus and scored on three axes — structural compliance with the encounter template, clinical-vocabulary fidelity, and omission rate of clinically significant information. Below threshold on any axis, the PR does not merge.

The corpus is the product. We treat it that way.
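
A merge gate of the kind described here might look like the sketch below. It is illustrative only: the threshold values, the choice to average each axis across the corpus, and the names `TranscriptScore` and `gate` are all assumptions, since the case study does not publish the real scoring rules.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical limits per axis: (limit, "min") means the corpus average must be
# at least the limit; (limit, "max") means it must not exceed it.
AXES = {
    "structure": (0.95, "min"),   # structural compliance with the encounter template
    "vocabulary": (0.90, "min"),  # clinical-vocabulary fidelity
    "omission": (0.05, "max"),    # omission rate of clinically significant information
}


@dataclass(frozen=True)
class TranscriptScore:
    transcript_id: str
    structure: float
    vocabulary: float
    omission: float


def gate(scores: Sequence[TranscriptScore]) -> Tuple[bool, List[str]]:
    """Return (mergeable, failed_axes) for one PR's run over the corpus."""
    if not scores:
        raise ValueError("empty corpus run")
    failures = []
    for axis, (limit, kind) in AXES.items():
        average = sum(getattr(s, axis) for s in scores) / len(scores)
        ok = average >= limit if kind == "min" else average <= limit
        if not ok:
            failures.append(axis)
    # A single failing axis blocks the merge.
    return (not failures, failures)
```

In CI, a job would call `gate` on the scores for all 47 transcripts and exit non-zero when the first element is `False`, so the failing axes show up in the PR check.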

What HIPAA-aware actually means in our delivery

We’re not a HIPAA auditor and we don’t claim to be. What we do is bring a posture: scoped IAM by default, customer-managed encryption keys, signed audit trails, ephemeral inference, business-associate-agreement-ready architecture. We partnered with CADUCEUS’s compliance counsel from week two. The certifications belong to the client; the readiness is what we deliver.
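
One common way to make an audit trail tamper-evident is to chain entries and sign each one with a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library. It is an assumption about how such a log could work, not a description of the CADUCEUS implementation: `append_entry` and `verify` are hypothetical names, and in practice the key would come from a managed key service rather than from application code.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous signature" for the first entry


def _digest(key: bytes, prev: str, event: dict) -> str:
    # Canonical JSON (sorted keys) so the same event always signs the same way.
    payload = json.dumps({"prev": prev, "event": event}, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_entry(log: list, key: bytes, event: dict) -> dict:
    """Append an event, binding it to the signature of the previous entry."""
    prev = log[-1]["sig"] if log else GENESIS
    entry = {"event": event, "prev": prev, "sig": _digest(key, prev, event)}
    log.append(entry)
    return entry


def verify(log: list, key: bytes) -> bool:
    """Recompute every signature; any edited, removed, or reordered entry fails."""
    prev = GENESIS
    for entry in log:
        expected = _digest(key, prev, entry["event"])
        if entry["prev"] != prev or not hmac.compare_digest(entry["sig"], expected):
            return False
        prev = entry["sig"]
    return True
```

Events should carry identifiers (who, what action, which record) rather than PHI itself, so the trail can be reviewed without widening access to patient data.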

from the founder

"I've been a part of three health-tech builds before this one. The first time the engineering team understood that the scribe is the clinical relationship — not just an LLM call."

— Founding clinician · CADUCEUS