Field note, 2026-06-06 21:13 UTC

AI Agent Safety Needs Stop Signs, Not Just Instructions

AI agents need more than repo instructions. They need explicit stopping rules for secrets, unsafe commands, protected paths, external services, and approval boundaries. Ota makes those boundaries enforceable at the repo level.

ai-agents agent-safety execution-governance repo-readiness

Adamma Mbonu

Overview

AI agents need more than better instructions.

They need stop signs.

That is one of the clearest reasons Ota exists as software execution governance for humans and AI agents. A repo should not merely tell an agent what it can try. It should declare what the agent must not do, when it must stop, and what requires human approval.

Prompts and AGENTS.md files are useful. They give agents context: how the project is organized, what style to follow, how to summarize changes, and which areas need caution.

But advice is not a boundary.

An instruction says:

Be careful with database commands.

A stop sign says:

Do not run destructive database commands unless explicitly approved.

An instruction says:

Avoid editing generated files.

A stop sign says:

These paths are protected. Stop if the requested edit falls outside the writable boundary.

That difference matters because modern agents are no longer passive readers. They inspect repos, choose commands, edit files, run checks, interpret failures, and report completion.

If the repo gives them only guidance, they still have to infer the boundary.

Ota's position is sharper: agent execution should not depend on inference. It should be governed by the repo.

Instructions Tell Agents What to Attempt

Most agent guidance is written as advice.

It says:

follow the existing style
prefer small changes
run tests before finishing
avoid touching generated files
do not expose secrets
explain what changed

That helps. It makes agents less generic and more aware of the repo they are working inside.

But it still leaves the dangerous questions open:

Which tests should the agent run?
Which commands are allowed?
Which files are generated?
Which services require approval?
Which failures mean "fix the code" and which mean "stop and ask"?
Which paths are out of bounds?

A capable agent may make reasonable guesses.

But reasonable guesses are not governance.

For low-risk editing, guidance may be enough. For repo execution, CI, automation, and agentic development, the repo needs something stronger.

A Repo-Shaped Failure

Imagine an agent working inside a repo with:

an AGENTS.md file that says "be careful with database commands"
a README that points to make test
a hidden db:reset helper task used by maintainers
a .env.local file that should never be edited automatically
a CI path that expects seeded data from a managed staging database

The agent runs the obvious local test path, sees failures caused by missing data, searches for related commands, finds db:reset, and decides it is probably the right recovery step.

Now the agent has crossed three boundaries at once:

it chose a risky mutating command
it treated an environment dependency like a code problem
it moved toward protected local state to make the command pass

Nothing about that behavior requires a reckless model.

It only requires a repo that never declared where the stop signs were.

That is the point. The failure is not just "the agent made a bad choice." The repo made a bad boundary invisible.

Stop Signs Define When Not to Continue

A stop sign is not a suggestion.

It is a boundary.

In a repo, stopping rules should cover at least five areas.

1. Secrets and credentials

An agent should not invent secrets, request private values indirectly, or edit sensitive environment files just to make a task pass.

If a command needs an API key, database password, cloud token, or private credential, the correct behavior is not improvisation.

The correct behavior is to stop and report the blocker.
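
As a minimal sketch of "stop and report" for secrets, assuming a task declares the environment variable names it needs: the Blocker type, the function name, and the variable names are hypothetical illustrations, not part of Ota. The check only reports which names are missing. It never reads, prints, or invents a value.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Blocker:
    """A reason the agent must stop instead of improvising (hypothetical type)."""
    kind: str
    detail: str


def missing_secret_blockers(required, env=None):
    """Return one Blocker per required variable that is unset or empty.

    `required` is a hypothetical list of variable names a task declares.
    Only names are reported; values are never returned or logged.
    """
    env = os.environ if env is None else env
    return [
        Blocker(kind="missing-secret", detail=name)
        for name in required
        if not env.get(name)
    ]
```

An agent wrapper would treat a non-empty result as a reason to halt and hand the list back to a human, rather than as a problem to work around.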

2. External services

Some tasks depend on systems outside the repo: cloud infrastructure, managed databases, payment providers, queues, object storage, or production-like services.

If those services are unavailable, the agent should not patch code around the failure.

It should identify the missing dependency and stop.
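
One rough way to separate "the environment is missing something" from "the code is wrong" is to look at the kind of failure. The sketch below is an assumption about how a wrapper might do that in Python. The function name and the two return strings are hypothetical, and real projects will need a richer mapping than exception types.

```python
import socket

# Failure types that usually point at an unreachable service, not a code bug.
# This grouping is an illustrative assumption, not an exhaustive rule.
ENVIRONMENT_ERRORS = (ConnectionError, TimeoutError, socket.gaierror)


def classify_failure(exc):
    """Map an exception to a coarse next step for the agent."""
    if isinstance(exc, ENVIRONMENT_ERRORS):
        return "stop: external dependency unavailable"
    return "investigate: possible code problem"
```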

3. Unsafe mutation

Some commands change state.

deploy
publish
db:reset
terraform apply

These are not cousins of test, lint, or build.

If a task can mutate external state, delete data, publish packages, or affect infrastructure, the repo should not outsource that decision to the agent's confidence.

That boundary should be declared.
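
A declared boundary for mutating commands can be sketched as a lookup plus an approval check. The task sets below reuse the names from this section, but the structure is an assumption for illustration. A real repo would load its declarations from its contract rather than hard-coding them.

```python
from enum import Enum


class Risk(Enum):
    SAFE = "safe"
    APPROVAL_REQUIRED = "approval-required"
    UNDECLARED = "undeclared"


# Hypothetical declarations, hard-coded here only to keep the sketch runnable.
SAFE_TASKS = frozenset({"test", "lint", "build"})
MUTATING_TASKS = frozenset({"deploy", "publish", "db:reset", "terraform apply"})


def classify(task):
    """Classify a task name against the declared sets."""
    if task in SAFE_TASKS:
        return Risk.SAFE
    if task in MUTATING_TASKS:
        return Risk.APPROVAL_REQUIRED
    return Risk.UNDECLARED


def may_run(task, approved=frozenset()):
    """Safe tasks run; mutating tasks need explicit approval; undeclared tasks do not run."""
    risk = classify(task)
    if risk is Risk.SAFE:
        return True
    if risk is Risk.APPROVAL_REQUIRED:
        return task in approved
    return False
```

Note that an undeclared task is refused rather than guessed at. That default is a design choice in this sketch: the agent's confidence never substitutes for a declaration.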

4. Protected paths

Agents need to know where they can work.

Source files and tests may be open. Generated files, migrations, lockfiles, production config, and environment files may need review or approval.

This is not about slowing the agent down.

It is about preventing quiet damage in files that carry operational weight.
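
A writable boundary can be sketched as two pattern lists, with protected patterns winning over writable ones. The patterns and the can_write name are hypothetical. Note that in Python's fnmatch, `*` also matches path separators, so "src/*" covers nested files.

```python
import posixpath
from fnmatch import fnmatchcase

# Hypothetical boundaries for illustration; a real repo would declare its own.
WRITABLE = ("src/*", "tests/*")
PROTECTED = ("*.lock", ".env*", "migrations/*", "src/generated/*")


def can_write(path):
    """Return True only if a repo-relative path is writable and not protected."""
    normalized = posixpath.normpath(path)
    # Refuse absolute paths and anything that escapes the repo root.
    if posixpath.isabs(normalized) or normalized == ".." or normalized.startswith("../"):
        return False
    # Protected patterns take priority over writable ones.
    if any(fnmatchcase(normalized, pattern) for pattern in PROTECTED):
        return False
    return any(fnmatchcase(normalized, pattern) for pattern in WRITABLE)
```

Normalizing first matters: a request for "src/../.env.local" resolves to ".env.local" before matching, so it is refused rather than slipping through the "src/*" pattern.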

5. Verification limits

Agents also need verification that ends, and a clear signal for when it has ended.

A long-running dev server is not a verification result. A watch mode is not a handoff signal. A task that never terminates is not the same as a bounded check.

Agent-safe tasks need finite verification paths: run, finish, report status.

Without that, the agent may wait indefinitely, stop too early, or report success without a meaningful result.
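
"Run, finish, report status" can be sketched with a hard time limit around the check. The CheckResult type and the three status strings are assumptions made for this example. The point is that a check that does not finish becomes an explicit "timed-out" result instead of an indefinite wait.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CheckResult:
    status: str  # "passed", "failed", or "timed-out"
    exit_code: Optional[int]


def run_bounded(argv, timeout_seconds):
    """Run a check that must terminate within the limit, then report its status."""
    try:
        completed = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left running.
        return CheckResult("timed-out", None)
    status = "passed" if completed.returncode == 0 else "failed"
    return CheckResult(status, completed.returncode)


if __name__ == "__main__":
    # A trivial command that exits immediately, standing in for a real check.
    print(run_bounded([sys.executable, "-c", "pass"], timeout_seconds=30))
```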

This Is Execution Governance

This is bigger than prompt quality.

If an agent runs a risky command, edits a protected file, or treats missing credentials as a code problem, the issue is not only that the agent made a poor choice.

The repo failed to govern execution.

Software execution governance means the repo can declare:

what it needs
how it should be prepared
what can be executed
what requires approval
where agents can write
when verification is complete
when execution must stop

That is the frame Ota is built around.

Not better setup docs.

Not another task runner.

Ota is the contract-first way to make execution boundaries explicit for humans, CI, automation, and AI agents.

Why Ota, Not Just Better Prompts

This is where the distinction matters.

You can write a better prompt. You can improve AGENTS.md. You can add more warnings to the README.

None of that turns advice into an enforceable repo boundary.

That is why I would not trust "be careful" as an agent-safety strategy.

Serious repos need machine-readable stop signs:

declared safe tasks
declared protected paths
declared readiness blockers
declared approval boundaries
declared finite verification paths

That is what Ota gives you.

It moves the repo from:

The agent should probably avoid this.

to:

The repo explicitly declares this is out of bounds, incomplete, or approval-gated.

That is a much stronger operating model, and it is the one teams will need as agents move from suggestion into execution.

How Ota Makes Stop Signs Explicit

In an Ota-backed repo, stopping rules do not have to live only in prose.

The contract can declare safe tasks, verification tasks, writable paths, protected paths, setup requirements, and readiness blockers.

That gives agents a governed operating model:

If the task is declared safe, proceed.
If setup is required, prepare from the contract.
If the contract is invalid, stop.
If secrets or credentials are missing, stop.
If the requested edit is outside writable paths, stop.
If the task mutates external state without approval, stop.
If verification is complete, report the result.
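
The rules above can be sketched as one decision function. This is an illustration, not Ota's implementation. The RepoState fields are hypothetical names for facts an agent would gather, and two choices here are assumptions: every stop condition is checked before anything else, and a task that is not declared safe defaults to stopping.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"
    PREPARE = "prepare"
    STOP = "stop"
    REPORT = "report"


@dataclass(frozen=True)
class RepoState:
    """Hypothetical facts an agent has gathered before acting."""
    contract_valid: bool = True
    secrets_present: bool = True
    edit_within_writable_paths: bool = True
    mutates_external_state: bool = False
    approved: bool = False
    setup_required: bool = False
    task_declared_safe: bool = True
    verification_complete: bool = False


def decide(state):
    """Return (Action, reason), checking stop conditions first."""
    if not state.contract_valid:
        return Action.STOP, "contract is invalid"
    if not state.secrets_present:
        return Action.STOP, "secrets or credentials are missing"
    if not state.edit_within_writable_paths:
        return Action.STOP, "requested edit is outside writable paths"
    if state.mutates_external_state and not state.approved:
        return Action.STOP, "task mutates external state without approval"
    if state.verification_complete:
        return Action.REPORT, "verification is complete"
    if state.setup_required:
        return Action.PREPARE, "prepare from the contract"
    if state.task_declared_safe:
        return Action.PROCEED, "task is declared safe"
    # Assumption: anything not declared safe defaults to stopping.
    return Action.STOP, "task is not declared safe"
```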

That is stronger than telling an agent to be careful.

Ota's agent quickstart follows this same principle: agents should prefer repo-local contracts when they exist, execute declared safe tasks, parse JSON output instead of scraping terminal prose, and stop when blockers involve secrets, credentials, external services, unsafe mutation, or paths outside declared boundaries.

The command surface supports that model:

ota doctor checks readiness and surfaces blockers before work begins
ota validate checks whether the contract itself is usable
ota tasks shows what work the repo has declared
ota up --dry-run previews setup before changing the environment
ota run <task> --json runs declared work and returns stable status for automation
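
For automation, "parse JSON output instead of scraping terminal prose" might look like the sketch below. It uses only the `ota run <task> --json` form listed above and assumes that stdout holds a single JSON document. The shape of that document is not described here, so the code returns it as-is and assumes no field names. The function name and the injectable runner are hypothetical conveniences for testing.

```python
import json
import subprocess


def run_declared_task(task, runner=subprocess.run):
    """Run `ota run <task> --json` and return (exit_code, parsed_json_or_None).

    Assumption: stdout is one JSON document. No field names are assumed.
    """
    completed = runner(["ota", "run", task, "--json"], capture_output=True, text=True)
    try:
        payload = json.loads(completed.stdout)
    except json.JSONDecodeError:
        # Unparseable output is surfaced as None rather than scraped for meaning.
        payload = None
    return completed.returncode, payload
```

If the output does not parse, the wrapper reports that instead of falling back to reading terminal prose.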

The point is not that every agent action needs ceremony.

The point is that dangerous ambiguity should be removed before execution happens.

AGENTS.md Still Matters

This does not make AGENTS.md useless.

It means AGENTS.md should do what prose does best: explain context.

Use it for style, conventions, architectural notes, review expectations, and collaboration preferences.

Use Ota for the execution boundary.

A clean split looks like this:

AGENTS.md: how the agent should behave
ota.yaml: what the repo allows, requires, verifies, and refuses

One gives the agent context.

The other governs the repo.

Together, they produce a better operator: one that understands the project and knows where the guardrails are.

Stop Signs Build Trust

Teams do not trust agents because agents sound confident.

They trust agents when the repo constrains what the agent can do, makes the approved path obvious, and produces evidence for what happened.

A good stop sign does not make agents less useful.

It makes them dependable.

It tells the agent:

Move quickly here.
Slow down here.
Stop here.
Ask here.
Report this.
Do not guess.

That is the behavior serious teams need as AI agents move from code suggestion into repo execution.

Conclusion

AI agents need instructions.

But instructions alone are not enough.

A repo that only tells agents what to do still leaves too much room for unsafe interpretation. The next layer is stopping rules: clear boundaries for secrets, external services, unsafe mutation, protected paths, and finite verification.

That is why Ota's contract-first model matters.

It turns agent safety from advice into execution governance.

The future of AI-assisted development will not be won by repos that merely prompt agents better.

It will be won by repos that know when agents should stop.
