I Read Claude Code's Agent Tooling, Then Built the One That Actually Finishes the Job
24 June 2026This is not a story about a journey. It is the complete work. I read the agent tooling everyone is starring, found the gap none of it filled, and built the piece that fills it. The gap was never reach. My agents could already see, scrape, and search. The thing that actually cost me hours was an agent that reads everything and still cannot finish a task without me standing over it. So this is about exactly that, in two lessons, each backed by the source it came from, copy-pasted so you can check me.
Lesson 1: The loop Claude ships is a timer, not a worker.
Claude Code has a built-in /loop. People assume it is an autonomous worker. It is not. The official
command description, copied exactly from the skill registry:
“Run a prompt or slash command on a recurring interval (e.g. /loop 5m /foo). Omit the interval to let the model self-pace. When the user wants to set up a recurring task, poll for status, or run something repeatedly on an interval.” (source: Claude Code
/loop, official command description)
Read it again. That is the entire behavior. It re-fires a prompt on a clock. It does not know what “done” means. It does not remember why the last attempt failed. It cannot undo a change it just broke. It is a fantastic primitive for polling and for cron-shaped repetition. It is not a thing you can hand a multi-step task and walk away from.
My blunt verdict after putting it on real work: as a task-finisher, /loop is garbage, and I can be
specific about why. It cannot tell you when it is done, because it has no concept of done. It reruns a
step that already passed and repeats a step that already failed, because it remembers nothing between
ticks. It will fire into a broken state forever, because it never verifies a result and never rolls
one back. It will overwrite your half-finished work without a second thought, because it has no
safety gate. Point it at a real multi-step job and it does not finish the job, it just keeps knocking
on the same door. That is not an insult to the tool. A timer was never built to be a worker. But the
gap is real, and it is the whole reason /autonom exists.
Lesson 2: The worker I built, and exactly why it beats /loop.
/autonom is an open-source Claude Code skill, MIT licensed, a single Markdown file you drop into your
skills folder: https://github.com/popescugeorgebogdan-debug/autonom. It takes the same starting
idea as /loop and finishes it properly: scope the task exhaustively first, then run autonomously
to a definition of done that a machine can actually verify, stopping only for a fixed set of safety
gates.
Direct comparison: /loop vs /autonom
| Capability | /loop (built-in) | /autonom (mine) |
|---|---|---|
| Core behavior | re-fires a prompt on an interval | scope, then autonomous loop, then verified DONE |
| Definition of “done” | none; runs until you stop it | refuses to start until DONE is a machine-checkable assertion (test rc=0 + assertions>0, endpoint status+body, file matches regex/AST) |
| Scoping before work | none | asks every outcome-determining question first, one dropdown at a time |
| Failure memory | none; each tick is blind | keys retries on sha256(intent + error); resets on new information, escalates only on the same repeating failure |
| Stuck detection | none | strike logic + HYPOTHESIS: forcing-function + per-intent wall-time/token caps + cycle-detection |
| Undo on breakage | none | git tag before every mutation; git reset --hard on verify-fail; never resets over your uncommitted work |
| Crash safety | none | atomic state.json single-source-of-truth; wakeup carries a fuse, not the spec; idempotency keys + cross-run error journal |
| Verification | none | authoritative-signal allowlist; “looks right” and “no errors printed” are explicitly banned |
| Safety gates | none | push / delete / publish / spend always pause and ask, even mid-run |
| Spec size | one sentence of behavior | a 150-line contract with a 16-point robustness layer |
That is not me being unfair to /loop. /loop is one sentence and is honest about being one
sentence. /autonom is more detailed and more complete because it had to be: every row in that table
is a countermeasure to a way I watched an unattended run fail.
The single most important design choice
Most “autonomous agent” wrappers copy the worst rule in the genre: stop after three failed attempts. That rule is wrong, and following it throws away the runs that were about to succeed. I learned it directly. I watched a run that the three-strikes rule would have killed solve the problem on the next attempt, because every “failure” had uncovered a new layer of the real cause.
So /autonom does not count bare attempts. From the skill’s own contract:
“Key the STOP counter on the (intent + normalized_error/diagnostic)-hash … INCREMENT the strike count only when an attempt repeats a PRIOR (intent,error) signature … RESET that intent’s strike count to 0 whenever an attempt surfaces a NEW error-hash, a new diagnostic fact, or a new hypothesis that is productive iteration, NOT a strike.” (source: autonom/SKILL.md)
It stops looping when it is genuinely stuck, and keeps pushing while it is still learning. That single distinction is the difference between a worker and a hamster wheel.
A confession that explains the whole design
I am a control freak and I own it. At the keyboard I babysit my agents and correct them constantly,
because they make plenty of mistakes and I want to catch each one before it compounds. /autonom is
for the other half of my life: asleep, or away and unable to hover. I do not actually trust an
unattended agent, so I encoded my own paranoia into it. The checkable-DONE gate, the
rollback-before-every-change, the stop-and-ask on anything dangerous. It is the babysitter I built so
I can finally leave the room.
Take it
/autonom is MIT and intentionally tiny. The full contract, the state schema, the 16-point robustness
layer, and the per-turn checklist are all in the one file:
→ https://github.com/popescugeorgebogdan-debug/autonom
Fork it, gut it, improve it. If you run unattended agents and have ever come back to one cheerfully looping, the “knowing when to quit” logic alone is worth the read.
Sources, all verified live for this piece: the Claude Code /loop official command description; and
my own autonom/SKILL.md (149 lines canonical, 157 in the published repo). Every quote above is
copy-pasted from its source.
Comments
Comments are moderated before they appear; your name and message become public.
Send me a message about this post
Private message · lands straight in my inbox.