📄 540-commits-one-human.md13/06/2026
Documents › workflows

540 Commits. One Human.

A week and a half ago I stopped writing most of my own code. Not because I got lazy. Because I got a better seat.

540 commits landed in 11 days. One human behind every one. I wrote almost none of them by hand. I routed them.

On the loop, not in it

I don't sit inside the work any more. I sit on top of it. I plan, I route, I sign off at the gates that matter. A fleet of models does the execution. The intelligence isn't any single model. It is the system around them: who gets which job, who checks the result, what gets remembered.

The week, by the numbers

  • 540 commits in 11 days. Peak day: 106.
  • 6 tasks built overnight. Every one audited. Zero input.
  • A graph that went from 54% to 84% precise. Then 9.5x faster in a day.
  • A signed macOS app, 165 tests green. 600 green on the memory engine.
  • One loop ran away. £2. Capped.
  • One human behind all of it.

Plan in, product out

I handed a plan to a build loop and went to sleep. By morning there was a finished result on the desk. 6 tasks, every one independently audited, about two hours start to finish. I touched none of it.

A second model marks the homework

Nothing counts as done because the worker says so. A separate model audits every result before it lands. Pass, fail, or escalate. That single rule is what lets me stop checking the work.

Right model, right job

Five models, each on what it does best. Fable holds the orchestrating seat. Codex writes the heavy code. Kimi reads long video. MiniMax M3 runs looping builds. Local models hold the memory. Opus 4.8 when the seat needs more muscle.

Measure, then kill the darling

I built a "smarter" retrieval system and measured it against the dumb baseline. It pulled 29% more context and got it more wrong. So I cut it. The yardstick beats the opinion, even when the opinion is mine.

Rails, not babysitting

One overnight loop ran away. It burned about £2 before a hard cap caught it. I did not jump in at 3am. The rail did its job. When a system misbehaves, you cap it structurally, then you walk away.

What this is becoming

All of it is one system now. I call it ARI-OS. The memory layer is Cortex. The week and a half was not a sprint. It was the system getting good enough that I could step back and let it run.

The intelligence is the routing, not any single model.

Related