Field note, 2026-06-30 16:00 UTC

AGENTS.md Is Not Enough for Safe AI Agent Execution

AGENTS.md can guide an AI coding agent, but it cannot by itself make execution safe, verifiable, or reviewable. Repos also need declared safe commands, canonical verification paths, and receipts that show what actually ran.

agents-md agent-safety repo-readiness execution-governance

Bobai Kato

Overview

AGENTS.md is useful.

It gives AI coding agents a place to find repo-specific guidance:

how to behave
what conventions matter
what areas need extra caution
what kinds of changes should trigger review

That is a meaningful improvement over sending an agent into a repo with no instructions at all.

But AGENTS.md is not enough.

It can tell an agent to be careful.

It cannot, by itself, make execution safe, verification trustworthy, or review inspectable.

For that, a repository needs more than instructions.

It needs:

declared safe commands
a canonical verification path
receipts that show what actually ran

That is the difference between agent guidance and execution governance.

Instructions Help. They Do Not Govern Execution.

An instruction file is still prose.

That means it can express intent, but it does not determine which commands actually run, and it cannot prove that they ran.

For example, AGENTS.md can say:

run the right checks before handoff
avoid destructive commands
do not edit generated files
ask before touching infrastructure

Those are good rules.

But notice what they leave unresolved:

which checks are the right ones
which commands are actually safe
which paths are protected structurally versus only suggested
what should count as evidence that verification happened
how to tell whether a failure came from code, setup, or drift

That is where many agent workflows still break down.

The agent may follow the spirit of the instructions and still take the wrong execution path.
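
To make the "structural versus suggested" distinction concrete, here is a minimal sketch of a rule like "do not edit generated files" expressed as a check that can fail a change rather than as prose. The pattern list and the function name are hypothetical; a real repo would declare its own protected paths.

```python
from __future__ import annotations

from fnmatch import fnmatchcase

# Hypothetical protected-path patterns, chosen only for illustration.
PROTECTED_PATTERNS = ["generated/*", "infra/*"]


def protected_violations(
    changed_paths: list[str],
    patterns: list[str] = PROTECTED_PATTERNS,
) -> list[str]:
    """Return the changed paths that match any protected pattern."""
    return [
        path
        for path in changed_paths
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    ]


# A change touching one ordinary file and one generated file.
print(protected_violations(["src/app.ts", "generated/client.ts"]))
# prints ['generated/client.ts']
```

A check like this can run before handoff and block the change, which is something an instruction in prose cannot do on its own.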

Safe Commands Need To Be Explicit

One of the biggest gaps in agent-oriented repos is that they often declare guidance without declaring a safe command surface.

The repo may tell the agent:

Run tests before you finish.

But that still leaves a dangerous amount of interpretation.

Which task is safe?

Is it:

npm test
pnpm test
make check
docker compose run test
a narrower unit-test path
the CI workflow itself

And if several exist, which one is canonical for a routine code change?

The repo should not force the agent to infer that from scattered hints.

It should declare safe commands explicitly.

That means giving the repo a machine-readable answer to questions like:

what tasks exist
which ones are agent-safe
what they depend on
what runtime mode they use
what they are expected to verify

That is much stronger than asking an agent to "be careful" around shell commands it still has to interpret.
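
One way to picture a machine-readable command surface is a small task registry that answers those questions directly. The sketch below is an assumption-laden illustration: the `Task` fields, the `setup` and `deploy` commands, and the `resolve_safe_command` helper are all hypothetical names, not the schema of any particular tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One declared task. Every field name here is an illustrative assumption."""

    name: str
    command: tuple[str, ...]          # executable followed by its arguments
    agent_safe: bool = False          # may an agent run this without asking?
    depends_on: tuple[str, ...] = ()  # tasks that must run first
    mode: str = "local"               # runtime mode the task expects
    verifies: str = ""                # what a passing run is expected to show


# Hypothetical registry for a pnpm-based repo.
TASKS = {
    "setup": Task("setup", ("pnpm", "install"), agent_safe=True),
    "test": Task(
        "test",
        ("pnpm", "test"),
        agent_safe=True,
        depends_on=("setup",),
        verifies="unit tests pass",
    ),
    "deploy": Task("deploy", ("make", "deploy")),  # declared, but not agent-safe
}


def resolve_safe_command(name: str) -> tuple[str, ...]:
    """Return the command for a declared, agent-safe task, or raise."""
    task = TASKS.get(name)
    if task is None:
        raise KeyError(f"undeclared task: {name}")
    if not task.agent_safe:
        raise PermissionError(f"task is not agent-safe: {name}")
    return task.command
```

With a registry like this, `resolve_safe_command("test")` returns `("pnpm", "test")`, while asking for `deploy` or for an undeclared task raises instead of leaving the agent to guess.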

Verification Needs To Be A Path, Not A Suggestion

The second gap is verification.

Many repos still treat verification like a recommendation rather than a declared path.

An instruction file might say:

Make sure everything still works before handoff.

That sounds fine, but it is too loose for reliable agent execution.

A trustworthy repo should be able to say something more concrete:

this is the setup path
this is the finite verification workflow
these tasks are safe to run for routine work
this heavier path exists, but it is not the default

That is the difference between advice and governance.

Without a declared verification path, the agent may:

pass a narrow local check and miss the real gate
run a destructive path unnecessarily
skip a required service-backed test lane
choose the wrong runtime mode
report success without proving repo readiness
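
A declared path can be resolved mechanically instead of inferred. As a rough sketch, assuming a workflow names one setup task and one run task and each task lists its dependencies, the ordered plan falls out of a simple depth-first walk. The `DEPENDS_ON` and `WORKFLOWS` tables and the `verification_plan` function are hypothetical.

```python
from __future__ import annotations

# Hypothetical declarations: each task mapped to the tasks it depends on.
DEPENDS_ON = {
    "setup": [],
    "lint": ["setup"],
    "test": ["setup"],
}

# Hypothetical workflow: a setup task followed by a run task.
WORKFLOWS = {"verify": {"setup": "setup", "run": "test"}}


def verification_plan(workflow: str) -> list[str]:
    """Expand a workflow into an ordered, duplicate-free list of tasks."""
    spec = WORKFLOWS[workflow]
    plan: list[str] = []

    def visit(task: str, stack: tuple[str, ...] = ()) -> None:
        if task in stack:
            raise ValueError(f"dependency cycle at {task}")
        if task in plan:
            return  # already scheduled earlier in the plan
        for dep in DEPENDS_ON[task]:
            visit(dep, stack + (task,))
        plan.append(task)  # dependencies first, then the task itself

    visit(spec["setup"])
    visit(spec["run"])
    return plan


print(verification_plan("verify"))  # prints ['setup', 'test']
```

The point of the sketch is that the plan is finite and reproducible: the same declaration always yields the same sequence, so "did verification happen" becomes a question about a known list of tasks.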

Receipts Are The Missing Trust Layer

Explicit commands and a declared verification path still fall short if nothing records what actually ran.

This is where receipts matter.

A verification receipt is the difference between:

"the agent says it ran the checks"

and:

"the repo can show which task ran, under which contract, in which mode, with what outcome"

That is the trust boundary most agent workflows still lack.

Receipts help answer questions like:

what contract or workflow was selected
what task actually executed
what backend or runtime mode was used
whether setup ran first
whether readiness was reached
what evidence existed when the run failed

Without receipts, review still depends too heavily on:

agent narration
terminal screenshots
CI guesswork
someone remembering what the command probably was

With receipts, verification becomes inspectable.
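
A receipt does not need to be elaborate to answer those questions. The following is a minimal sketch of one possible shape, serialized as JSON so a reviewer or another tool can inspect it. The field names and the `make_receipt` and `to_json` helpers are assumptions for illustration and do not describe the receipt format of any existing tool.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Receipt:
    """Execution evidence for one task run. Field names are hypothetical."""

    workflow: str             # which contract or workflow was selected
    task: str                 # which task actually executed
    command: tuple[str, ...]  # the exact command line that ran
    mode: str                 # backend or runtime mode
    setup_ran: bool           # whether setup ran first
    exit_code: int            # outcome reported by the process
    finished_at: str          # UTC timestamp in ISO 8601 form

    @property
    def passed(self) -> bool:
        return self.exit_code == 0


def make_receipt(workflow, task, command, mode, setup_ran, exit_code) -> Receipt:
    """Build a receipt stamped with the current UTC time."""
    finished_at = datetime.now(timezone.utc).isoformat()
    return Receipt(workflow, task, tuple(command), mode, setup_ran, exit_code, finished_at)


def to_json(receipt: Receipt) -> str:
    """Serialize a receipt with stable key order so receipts diff cleanly."""
    return json.dumps(asdict(receipt), sort_keys=True)


receipt = make_receipt("verify", "test", ["pnpm", "test"], "local", True, 0)
print(to_json(receipt))
```

In practice the runner that executes the task would fill in `exit_code` from the real process; here it is passed in by hand to keep the sketch free of side effects.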

What A Better Repo Looks Like

A stronger repository keeps these layers distinct:

AGENTS.md for human-written behavioral guidance
a contract surface for tasks, workflows, safe commands, and boundaries
receipts for execution evidence

For example:

AGENTS.md (markdown)

- Prefer small diffs.
- Do not edit generated files manually.
- Escalate before changing deployment or billing flows.
- Use the declared verification path before handoff.

OTA CONTRACT (yaml)

    agent:
      safe_tasks:
        - lint
        - typecheck
        - test
      verify_after_changes:
        - test
    tasks:
      test:
        command:
          exe: pnpm
          args: [test]
        depends_on:
          - setup
    workflows:
      verify:
        setup:
          task: setup
        run:
          task: test

And then the execution layer should be able to produce evidence rather than only output:

VERIFY (bash)

    ota run test --json
    ota receipt --json --archive

The exact tool does not matter as much as the structure:

instructions
safe commands
verification path
receipt

That is the minimum shape of trustworthy agent execution.

Why This Matters More Now

This structure was already useful when agents mostly suggested edits.

It becomes much more important when agents are expected to:

choose commands
prepare environments
run checks
interpret failures
decide whether work is complete

At that point, the problem is no longer just "does the agent have instructions?"

The problem is whether the repo can expose:

a safe execution surface
a deterministic verification path
evidence that the declared path actually ran

That is a higher bar than AGENTS.md alone can satisfy.

This Is The Stronger Split

If you only need the boundary between instructions and contracts, read:

This post is narrower.

Its claim is not just that AGENTS.md and ota.yaml do different jobs.

Its claim is that even a good instruction file is still not enough unless the repo also declares:

which commands are safe
which verification path is canonical
what receipt counts as evidence

Bottom Line

AGENTS.md is a good start.

But repo instructions alone do not make agent execution safe, reviewable, or trustworthy.

To get there, repositories also need:

explicit safe commands
declared verification workflows
receipts that preserve execution evidence

That is how you move from:

"the agent had guidance"

to:

"the repo had governed execution"
