kaedax
← work

An internal operations console with agent workers replacing six manual workflows — saving an estimated thirty-two operator-hours a week.

HARBOR's operations team ran six recurring workflows manually: invoice triage, customer-tier reconciliation, churn-risk flagging, ticket-tag normalization, partner-payout calculation, and report-distribution. We built an internal console with agent workers doing the toil and operators reviewing the calls.

stack → Next.js + tRPC Postgres Inngest Slack API Linear API Notion API Datadog
▷ INBOX QUEUE · ops items #2847 churn-risk ▶ workr #2846 invoice ▶ workr #2845 ticket-tag ▶ workr #2844 payout queued #2843 tier-recon queued #2842 report queued #2841 payout queued #2840 churn-risk queued ▷ AGENT WORKERS · 6 · narrow scope churn.worker acc 91% · AUTO invoice.worker acc 97% · AUTO tag.worker acc 94% · AUTO payout.worker acc 88% · REVIEW tier.worker acc 93% · AUTO report.worker acc 99% · AUTO ▷ OPERATOR CONSOLE THIS WEEK Items processed 1,847 Auto-executed 1,621 (88%) Operator override 127 (6.9%) Escalated 99 (5.4%) Hours reclaimed ~32 / week ▷ CASE · HARBOR · BLUEPRINT OPS · INTERNAL AUTOMATION
SCHEMATIC An abstract view of the HARBOR engagement — not a literal product screenshot. Built to communicate engineering shape, not surface design.

outcomes

01

~32h/wk

Operator hours redirected

02

4 / 6

Workflows live at cutover · 2 in shadow

03

< 8%

Operator override rate post-launch

04

0

Silent writes by any worker

[ §01 ] the cycle

How 720 hours
actually ran.

  1. Day 01 — 04

    Workflow audit + agent shape

    scope.agent shadowed each workflow with the operators for a half-day. Output: six workflow specs, each with explicit human-handoff points, escalation rules, and a list of edge cases the operators had implicitly handled.

    6 workflow specs handoff matrix edge case catalogue
  2. Day 05 — 17

    Console + worker build

    build.agent shipped the console (one queue view per workflow) and six narrow agent workers. Each worker proposes a decision; the operator confirms, edits, or escalates. No worker writes to a system of record without explicit human confirmation.

    console live 6 workers 0 silent writes
  3. Day 18 — 24

    Shadow-run + accuracy gates

    qa.agent ran every worker in shadow mode against the operators' real decisions for a week. Per-worker accuracy gates were set: workers below threshold required explicit operator review on every item; workers above threshold were allowed to auto-execute the unambiguous cases.

    shadow run · 7d accuracy gates set auto-exec thresholds
  4. Day 25 — 30

    Gradual cutover

    Operators kept the manual workflow as fallback for the first week. Each workflow cut over to the console as its worker hit threshold. By day 30, four of six were fully live; two stayed in shadow mode pending edge-case work.

    4/6 live 2/6 in shadow fallback retained

[ §02 ] agent log · selected

What the loop
looked like.

cycle-log · harbor
archived
T+120h [INFO] scope.agent 6 workflows mapped · 47 edge cases catalogued from operator interviews
T+240h [ >> ] build.agent console shipped · queue view × 6 · operator-first UX
T+360h [WARN] qa.agent churn-risk worker accuracy 81% · below threshold · needs more signal
T+480h [ OK ] build.agent churn-risk worker: added support-ticket sentiment signal · accuracy 91%
T+600h [ OK ] deploy.agent 4 workflows cutover · operators retain manual fallback for 14d
T+720h [ OK ] monitor.agent per-worker decision drift monitored · operator override rate < 8%

[ §03 ] notes from the cycle

HARBOR is a kind of engagement we get a lot more of than founders expect. Not every agent build is a customer-facing AI feature — many are internal tools that quietly take operator toil off the floor.

The default we negotiated up front

No silent writes. Every agent worker proposes a decision; an operator confirms, edits, or escalates. The high-accuracy workers eventually auto-execute the unambiguous cases, but that gate is per-worker, gradual, and reversible.

This is the right default for internal-ops automation in a growing company. The cost of a wrong write to a system of record (CRM, billing, ledger) is much higher than the benefit of one less operator click. We bias the design accordingly.

What “agent worker” actually means here

Each worker is a narrow program — a few hundred lines, a defined input contract, a defined output proposal, an explicit set of tools it can read from. They’re not general-purpose agents that “figure out the workflow.” They’re specific, evaluable, auditable, and individually replaceable.

The Operations VP can read the source of each worker in an afternoon. That’s a feature, not a regression.

What the operators do now

The four workflows that cut over still need operator attention — but on the edge cases, not the toil. Operators now spend their time on the ambiguous calls (the churn-risk flag that needs a human read of the customer’s last support ticket; the partner payout where a contract amendment changed the formula). The repetitive 80% is in the queue.

Several operators told us this is the first software change in their career where they felt their team got more important, not less.

from the founder

"I expected we'd buy a few hours back. We bought back four operator days a week, and the team likes the work they do now better than what we automated away."

— VP Operations · HARBOR