I Read Claude Code's Agent Tooling, Then Built the One That Actually Finishes the Job

This is not a story about a journey. It is the complete work. I read the agent tooling everyone is starring, found the gap none of it filled, and built the piece that fills it. The gap was never reach. My agents could already see, scrape, and search. The thing that actually cost me hours was an agent that reads everything and still cannot finish a task without me standing over it. So this is about exactly that, in two lessons, each backed by the source it came from, copy-pasted so you can check me.

Lesson 1: The loop Claude ships is a timer, not a worker.

Claude Code has a built-in /loop. People assume it is an autonomous worker. It is not. The official command description, copied exactly from the skill registry:

“Run a prompt or slash command on a recurring interval (e.g. /loop 5m /foo). Omit the interval to let the model self-pace. When the user wants to set up a recurring task, poll for status, or run something repeatedly on an interval.” (source: Claude Code /loop, official command description)

Read it again. That is the entire behavior. It re-fires a prompt on a clock. It does not know what “done” means. It does not remember why the last attempt failed. It cannot undo a change it just broke. It is a fantastic primitive for polling and for cron-shaped repetition. It is not a thing you can hand a multi-step task and walk away from.

My blunt verdict after putting it on real work: as a task-finisher, /loop is garbage, and I can be specific about why. It cannot tell you when it is done, because it has no concept of done. It reruns a step that already passed and repeats a step that already failed, because it remembers nothing between ticks. It will fire into a broken state forever, because it never verifies a result and never rolls one back. It will overwrite your half-finished work without a second thought, because it has no safety gate. Point it at a real multi-step job and it does not finish the job; it just keeps knocking on the same door. That is not an insult to the tool. A timer was never built to be a worker. But the gap is real, and it is the whole reason /autonom exists.
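To make the timer-versus-worker distinction concrete, here is a toy model in Python. It is not Claude Code's implementation of /loop, and every name in it (`timer_loop`, `worker_loop`, `make_task`, `budget`) is hypothetical. It only shows the structural difference: one loop never consults a result, the other checks a done-predicate before each step.

```python
from typing import Callable


def timer_loop(step: Callable[[], None], ticks: int) -> int:
    """Re-fire `step` once per tick. Nothing here inspects the outcome."""
    fired = 0
    for _ in range(ticks):  # stands in for "every N minutes until someone stops it"
        step()
        fired += 1
    return fired


def worker_loop(step: Callable[[], None], is_done: Callable[[], bool], budget: int) -> int:
    """Run `step` only while the task is not verifiably done, within a budget."""
    attempts = 0
    while attempts < budget and not is_done():
        step()
        attempts += 1
    return attempts


def make_task(target: int):
    """Hypothetical task: it is finished once `progress` reaches `target`."""
    state = {"progress": 0}

    def step() -> None:
        state["progress"] += 1

    def is_done() -> bool:
        return state["progress"] >= target

    return state, step, is_done


if __name__ == "__main__":
    state, step, _ = make_task(target=3)
    print("timer fired", timer_loop(step, ticks=10), "times")

    state, step, is_done = make_task(target=3)
    print("worker ran", worker_loop(step, is_done, budget=10), "steps")
```

With a task that is finished after three steps, the timer still fires on all ten ticks, while the worker stops after the third step because it asks whether the job is done.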

Lesson 2: The worker I built, and exactly why it beats /loop.

/autonom is an open-source Claude Code skill, MIT licensed, a single Markdown file you drop into your skills folder: https://github.com/popescugeorgebogdan-debug/autonom. It takes the same starting idea as /loop and finishes it properly: scope the task exhaustively first, then run autonomously to a definition of done that a machine can actually verify, stopping only for a fixed set of safety gates.
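What "a definition of done that a machine can actually verify" might look like in practice: a minimal Python sketch, not code from the skill itself (which is a Markdown contract). The names `CheckResult`, `command_exits_zero`, `file_matches`, and `is_done` are assumptions made for illustration, and the two check kinds shown are only a selection.

```python
import re
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Sequence


@dataclass(frozen=True)
class CheckResult:
    ok: bool
    reason: str


def command_exits_zero(argv: Sequence[str], timeout: float = 60.0) -> CheckResult:
    """Pass only on exit code 0; the command's printed output is ignored."""
    proc = subprocess.run(list(argv), capture_output=True, text=True, timeout=timeout)
    return CheckResult(proc.returncode == 0, f"exit code {proc.returncode}")


def file_matches(path: Path, pattern: str) -> CheckResult:
    """Pass only if the file exists and its text contains a regex match."""
    if not path.is_file():
        return CheckResult(False, f"{path} is missing")
    found = re.search(pattern, path.read_text(encoding="utf-8")) is not None
    return CheckResult(found, f"pattern {pattern!r} {'found' if found else 'not found'}")


def is_done(checks: Sequence[Callable[[], CheckResult]]) -> bool:
    """DONE requires at least one check, and every check must pass."""
    results = [check() for check in checks]
    return bool(results) and all(result.ok for result in results)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "version.txt"
        target.write_text("version = 2\n", encoding="utf-8")
        checks = [
            lambda: command_exits_zero([sys.executable, "-c", "raise SystemExit(0)"]),
            lambda: file_matches(target, r"^version = \d+"),
        ]
        print(is_done(checks))
```

An empty list of checks counts as not done in this sketch, on the reasoning that a run which verified nothing has proven nothing. Counting executed test assertions is framework-specific and is left out here.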

Direct comparison: /loop vs /autonom

| Capability | `/loop` (built-in) | `/autonom` (mine) |
| --- | --- | --- |
| Core behavior | re-fires a prompt on an interval | scope, then autonomous loop, then verified DONE |
| Definition of “done” | none; runs until you stop it | refuses to start until DONE is a machine-checkable assertion (test rc=0 + assertions>0, endpoint status+body, file matches regex/AST) |
| Scoping before work | none | asks every outcome-determining question first, one dropdown at a time |
| Failure memory | none; each tick is blind | keys retries on `sha256(intent + error)`; resets on new information, escalates only on the same repeating failure |
| Stuck detection | none | strike logic + `HYPOTHESIS:` forcing-function + per-intent wall-time/token caps + cycle-detection |
| Undo on breakage | none | `git tag` before every mutation; `git reset --hard` on verify-fail; never resets over your uncommitted work |
| Crash safety | none | atomic `state.json` single-source-of-truth; wakeup carries a fuse, not the spec; idempotency keys + cross-run error journal |
| Verification | none | authoritative-signal allowlist; “looks right” and “no errors printed” are explicitly banned |
| Safety gates | none | push / delete / publish / spend always pause and ask, even mid-run |
| Spec size | one sentence of behavior | a 150-line contract with a 16-point robustness layer |

That is not me being unfair to /loop. /loop is one sentence and is honest about being one sentence. /autonom is more detailed and more complete because it had to be: every row in that table is a countermeasure to a way I watched an unattended run fail.
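The "Undo on breakage" row is the easiest one to picture in code. Below is a minimal Python sketch of a tag-then-reset guard, under stated assumptions: the function names, the tag name, and the injectable `run` parameter are hypothetical and not taken from the skill, and the sketch assumes a repository with at least one commit. The git commands themselves (`git status --porcelain`, `git tag --force`, `git reset --hard`) are standard.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def git(args: Sequence[str]) -> str:
    """Run a git command in the current directory and return its stdout."""
    done = subprocess.run(["git", *args], check=True, capture_output=True, text=True)
    return done.stdout


def worktree_is_clean(run: Runner = git) -> bool:
    # `git status --porcelain` prints nothing when there is no uncommitted work.
    return run(["status", "--porcelain"]).strip() == ""


def checkpoint(tag: str, run: Runner = git) -> None:
    """Tag the current commit, refusing if the user has uncommitted changes."""
    if not worktree_is_clean(run):
        raise RuntimeError("uncommitted work present; refusing to continue")
    run(["tag", "--force", tag])


def guarded_mutation(
    tag: str,
    mutate: Callable[[], None],
    verify: Callable[[], bool],
    run: Runner = git,
) -> bool:
    """Checkpoint, mutate, verify; roll back to the tag when verification fails."""
    checkpoint(tag, run)
    mutate()
    if verify():
        return True
    # Safe only because checkpoint() proved the tree was clean beforehand.
    # Note: reset --hard does not delete new untracked files; this sketch ignores them.
    run(["reset", "--hard", tag])
    return False
```

Passing the runner as a parameter keeps the guard testable without a real repository, and the clean-tree check is what makes the later `git reset --hard` unable to discard anything the agent did not create itself.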

The single most important design choice

Most “autonomous agent” wrappers copy the worst rule in the genre: stop after three failed attempts. That rule is wrong, and following it throws away the runs that were about to succeed. I learned it directly. I watched a run that the three-strikes rule would have killed solve the problem on the next attempt, because every “failure” had uncovered a new layer of the real cause.

So /autonom does not count bare attempts. From the skill’s own contract:

“Key the STOP counter on the (intent + normalized_error/diagnostic)-hash … INCREMENT the strike count only when an attempt repeats a PRIOR (intent,error) signature … RESET that intent’s strike count to 0 whenever an attempt surfaces a NEW error-hash, a new diagnostic fact, or a new hypothesis that is productive iteration, NOT a strike.” (source: autonom/SKILL.md)

It stops looping when it is genuinely stuck, and keeps pushing while it is still learning. That single distinction is the difference between a worker and a hamster wheel.
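The quoted rule can be sketched in a few lines of Python. This is an illustration of the idea, not the skill's code: the normalization rules, the `StrikeTracker` class, and the `max_strikes` threshold are all assumptions, since the quote does not specify them. Only the core logic comes from the contract: hash intent plus normalized error, add a strike when a signature repeats, reset to zero when a new one appears.

```python
import hashlib
import re
from collections import defaultdict


def normalize_error(error: str) -> str:
    """Strip volatile details so the same failure always hashes the same way."""
    text = error.lower()
    text = re.sub(r"0x[0-9a-f]+", "<hex>", text)  # memory addresses
    text = re.sub(r"\d+", "<n>", text)            # line numbers, pids, timestamps
    return re.sub(r"\s+", " ", text).strip()


def signature(intent: str, error: str) -> str:
    payload = intent + "\n" + normalize_error(error)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class StrikeTracker:
    def __init__(self, max_strikes: int = 3) -> None:
        self.max_strikes = max_strikes    # assumed threshold, not from the contract
        self.seen = defaultdict(set)      # intent -> signatures already observed
        self.strikes = defaultdict(int)   # intent -> current strike count

    def record_failure(self, intent: str, error: str) -> bool:
        """Record a failed attempt; return True when the intent should stop."""
        sig = signature(intent, error)
        if sig in self.seen[intent]:
            self.strikes[intent] += 1     # same failure again: a real strike
        else:
            self.seen[intent].add(sig)
            self.strikes[intent] = 0      # new error: productive iteration
        return self.strikes[intent] >= self.max_strikes


if __name__ == "__main__":
    tracker = StrikeTracker(max_strikes=2)
    for err in ["ImportError at line 10", "ImportError at line 12", "TypeError: bad arg"]:
        print(err, "->", tracker.record_failure("fix build", err))
```

A plain attempt counter would treat those three failures the same way. Here the second one is a strike (same error once the line number is normalized away), and the third resets the count because it is a different error.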

A confession that explains the whole design

I am a control freak and I own it. At the keyboard I babysit my agents and correct them constantly, because they make plenty of mistakes and I want to catch each one before it compounds. /autonom is for the other half of my life: asleep, or away and unable to hover. I do not actually trust an unattended agent, so I encoded my own paranoia into it: the checkable-DONE gate, the rollback before every change, the stop-and-ask on anything dangerous. It is the babysitter I built so I can finally leave the room.

Take it

/autonom is MIT and intentionally tiny. The full contract, the state schema, the 16-point robustness layer, and the per-turn checklist are all in the one file:

→ https://github.com/popescugeorgebogdan-debug/autonom

Fork it, gut it, improve it. If you run unattended agents and have ever come back to one cheerfully looping, the “knowing when to quit” logic alone is worth the read.

Sources, both verified live for this piece: the official Claude Code /loop command description and my own autonom/SKILL.md (149 lines canonical, 157 in the published repo). Every quote above is copy-pasted from its source.