Field note · 2026-05-29 18:46 UTC

How to Align Local, CI, and Agent Execution

Why repos break when local development, CI, and AI agents do not share the same setup, tasks, and verification path.

Overview

One of the fastest ways to make a repo unreliable is to let local development, CI, and agent execution drift into three different stories.

A developer runs one command locally. CI runs a stricter path in a different environment. The agent reads the README, checks package scripts, opens the workflow file, and has to guess which version of the repo is actually true.

That is where bad automation starts.

The code may be fine. The failure is often operational. The repo never made its intended execution path clear enough for humans, CI, and agents to follow the same logic.

Drift Starts Quietly

Execution drift usually does not arrive as a big design decision. It grows out of convenience.

A project starts with something simple:

COMMANDS (bash)
npm test

Later, CI gets stricter:

CI FLOW (bash)
pnpm install --frozen-lockfile
pnpm lint
pnpm typecheck
pnpm test:ci

Then the repo grows again. A service moves into Docker. Integration tests need Postgres. A new environment variable shows up. The README stays broad, CI becomes the only place with the full path, and local scripts explain only part of the story.

Now the repo has multiple sources of operational truth.

Humans feel that as "it works on my machine." Agents feel it as conflicting signals. They run the most obvious command, get a partial pass, and report success against a repo that would still fail in CI.

That is not always an agent reasoning failure. A lot of the time, the repo gave incomplete instructions.

Alignment Does Not Mean Identical Commands

Alignment does not require local development, CI, and agents to run the exact same command sequence.

Local workflows can stay fast. CI can stay strict. Agents can have tighter safety boundaries than maintainers.

What has to stay aligned is the intent.

INTENT (txt)
Local:
Run fast checks before and during development.

CI:
Run the full verification path before merge.

Agent:
Run declared safe tasks, report what passed, and stop cleanly at boundaries.

The problem starts when each path is invented separately instead of declared from the same operating model.

What To Align First

The best place to start is with the parts of the repo that create the most confusion and the most false confidence.

1. Setup

There should be one clear setup path for the repo.

For a Python service, that might be:

PYTHON SETUP (bash)
poetry install
docker compose up -d postgres
poetry run alembic upgrade head

For a Go service, it might be:

GO SETUP (bash)
go mod download
docker compose up -d redis
go test ./...

The stack is not the point. Clarity is. Setup should not be split across README prose, CI YAML, shell history, and one teammate's memory.
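One way to give a repo a single setup entry point is a small script that lists the steps in order and stops at the first failure. The sketch below reuses the Python commands from the example above. The file name `scripts/setup.sh` and the helpers `setup_steps` and `run_setup` are assumptions for illustration, not a convention from any tool.

```shell
#!/bin/sh
# Hypothetical scripts/setup.sh: one declared setup path for the Python example.

# Print the setup steps, one command per line, in the order they must run.
setup_steps() {
    cat <<'EOF'
poetry install
docker compose up -d postgres
poetry run alembic upgrade head
EOF
}

# Run each step in order and stop at the first failure.
# Set DRY_RUN=1 to print the steps without executing them.
run_setup() {
    setup_steps | while IFS= read -r step; do
        echo "setup: $step"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            continue
        fi
        # Redirect stdin so a step cannot swallow the remaining step list.
        sh -c "$step" </dev/null || { echo "setup failed at: $step" >&2; exit 1; }
    done
}
```

Calling `run_setup` at the end of the script would make it the one place the README, CI, and agents can all point to. `DRY_RUN=1` lets anyone read the path without changing the machine.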

2. Tasks

Repos should expose common tasks with clear names.

TASKS (txt)
setup
test
test:integration
lint
typecheck
build
dev

People and agents both work better when task names explain intent. If someone has to inspect every script to figure out what is quick, complete, safe, or destructive, the repo is already too ambiguous.
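A task list is easier to trust when each name carries a one-line statement of intent. Here is a minimal sketch, assuming hypothetical `describe_task` and `list_tasks` helpers. The descriptions are illustrative guesses about what each task might mean, not facts about any particular repo.

```shell
#!/bin/sh
# Hypothetical task catalog: every declared task gets a one-line intent.

TASKS="setup test test:integration lint typecheck build dev"

# Print what a task is for, so nobody has to read the script body to find out.
describe_task() {
    case "$1" in
        setup)            echo "install dependencies and start required services" ;;
        test)             echo "quick unit tests, no external services" ;;
        test:integration) echo "integration tests, needs the declared services" ;;
        lint)             echo "static style checks, read-only" ;;
        typecheck)        echo "static type checks, read-only" ;;
        build)            echo "produce build artifacts locally" ;;
        dev)              echo "start the local development server" ;;
        *)                echo "unknown task: $1" >&2; return 1 ;;
    esac
}

# Print the whole catalog, one "name: intent" line per task.
list_tasks() {
    for task in $TASKS; do
        echo "$task: $(describe_task "$task")"
    done
}
```

An unknown name fails loudly instead of silently doing nothing, which is the behavior both a new contributor and an agent need.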

3. Verification

Verification is where drift becomes expensive.

A repo should separate quick feedback from full verification:

VERIFICATION (txt)
Quick check:
pytest tests/unit

Full verification:
pytest --cov
ruff check .
mypy .

Without that distinction, contributors stop at the fast check and assume they are done while CI still holds the real standard.
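The quick/full split can live in one entry point so the two levels are never confused. This is a sketch under assumptions: the script name `scripts/verify.sh` and the functions `verify_commands` and `run_verify` are made up for illustration, and the commands are the ones from the example above.

```shell
#!/bin/sh
# Hypothetical scripts/verify.sh: one entry point, two declared levels.

# Print the commands for a verification level, one per line.
verify_commands() {
    case "$1" in
        quick)
            echo "pytest tests/unit"
            ;;
        full)
            echo "pytest --cov"
            echo "ruff check ."
            echo "mypy ."
            ;;
        *)
            echo "usage: verify quick|full" >&2
            return 2
            ;;
    esac
}

# Run every command for the level and report which level actually passed.
run_verify() {
    level="$1"
    commands=$(verify_commands "$level") || return 2
    printf '%s\n' "$commands" | while IFS= read -r cmd; do
        echo "verify[$level]: $cmd"
        sh -c "$cmd" </dev/null || { echo "verify[$level] failed at: $cmd" >&2; exit 1; }
    done || return 1
    echo "verify[$level]: passed"
}
```

Because the final line names the level, a report of `verify[quick]: passed` cannot be mistaken for the standard CI enforces. CI would call `run_verify full`; a contributor mid-change would call `run_verify quick`.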

4. Services

Services should never be implied.

If a task needs Postgres, Redis, Elasticsearch, a queue, or a local emulator, the repo should say so directly. It should also say whether that service is started with Docker Compose, expected from the host machine, or provided elsewhere.

This matters because missing services often look like application bugs. An agent can lose time chasing the wrong problem when the real issue is that the repo never declared its dependencies clearly.
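A declared service list lets a task fail with "Postgres is not running" instead of a confusing stack trace. The sketch below assumes a hypothetical mapping from tasks to services. The names `required_services` and `missing_services` and the mapping itself are illustrative.

```shell
#!/bin/sh
# Hypothetical service declaration: which tasks need which services.

# Print the services a task requires, space-separated (empty if none).
# The mapping is illustrative only.
required_services() {
    case "$1" in
        test:integration) echo "postgres redis" ;;
        dev)              echo "postgres" ;;
        *)                echo "" ;;
    esac
}

# Print each required service that is absent from the running list.
#   $1: task name
#   $2: space- or newline-separated names of running services
# With Docker Compose v2, one way to build $2 is:
#   docker compose ps --services --status running
missing_services() {
    for service in $(required_services "$1"); do
        found=0
        for running in $2; do
            if [ "$running" = "$service" ]; then
                found=1
            fi
        done
        if [ "$found" -ne 1 ]; then
            echo "$service"
        fi
    done
}
```

For example, `missing_services test:integration "postgres"` prints `redis`. A wrapper could print that name and stop before the tests run, which saves an agent from debugging an application bug that does not exist.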

5. Safety Boundaries

Agents need a clean line between what is safe to run and what requires review.

Safe tasks usually look like:

TASKS (txt)
test
lint
typecheck
build

Riskier tasks usually look like:

TASKS (txt)
deploy
publish
db:reset
terraform apply

If that boundary is not explicit, the agent is left inferring safety from command names. That is not a serious operating model.
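An explicit boundary can be as small as an allowlist that defaults to "review" for anything it does not recognize. This is a minimal sketch. The function names `task_policy` and `agent_run`, the exit code 3, and the `./scripts/<task>.sh` layout are all assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical agent policy: an explicit allowlist, not inference from names.

# Print "safe" for tasks an agent may run unattended and "review" otherwise.
# Anything not listed as safe defaults to "review".
task_policy() {
    case "$1" in
        test|lint|typecheck|build) echo "safe" ;;
        *)                         echo "review" ;;
    esac
}

# Run a task only if the policy allows it; otherwise stop with a clear message.
# Assumes each task is runnable as ./scripts/<task>.sh, which is a made-up layout.
agent_run() {
    if [ "$(task_policy "$1")" != "safe" ]; then
        echo "blocked: '$1' requires human review" >&2
        return 3
    fi
    "./scripts/$1.sh"
}
```

Defaulting to "review" matters more than the list itself. A new task such as `db:reset` is blocked until someone deliberately marks it safe, so the agent never has to guess from the name.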

Where Ota Fits

This is the problem Ota is meant to solve.

Ota turns repo execution from scattered instructions into a declared contract. Instead of hoping the README, scripts, and CI happen to agree, the repo can describe its readiness model in ota.yaml.

That contract can declare:

  • what the repo needs
  • how setup happens
  • which tasks exist
  • which workflow is expected
  • what counts as readiness
  • which commands are safe for agents

The value is not just that Ota can run commands. The value is that humans, CI, and agents can all operate from the same declared path.

For example:

  • ota doctor checks whether the repo is actually ready and points to the blocker
  • ota validate checks whether the contract is valid
  • ota up prepares the repo from the declared contract
  • ota run <task> runs a declared task instead of forcing people or agents to reverse-engineer the right command

The README can still explain the project. CI can still enforce the standard. Ota gives them a shared execution contract so they stop drifting apart.

A Simple Alignment Check

Before adding more setup prose to a README, it is worth asking:

ALIGNMENT CHECK (txt)
Can a new contributor find the correct setup path?
Does CI prove the same core tasks the repo tells people to run?
Is the difference between quick checks and full verification explicit?
Are required services declared clearly?
Are dangerous commands separated from safe ones?
Can an AI agent tell what it is allowed to run?
Is there one place that explains how execution is supposed to work?

If the answer to any of these is no, the repo does not mainly need more documentation. It needs better alignment.
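The second question, whether CI proves the tasks the repo advertises, can be roughly approximated with a text search. This is a crude sketch, assuming a hypothetical `tasks_missing_from_ci` helper. It only checks whether each task name appears somewhere in the workflow file.

```shell
#!/bin/sh
# Hypothetical drift check: do the tasks the repo advertises appear in CI?

# Print every task name that does not occur anywhere in the CI workflow file.
#   $1: path to the workflow file
#   $2...: task names the repo tells contributors to run
# This is a plain substring match, so treat the result as a hint, not proof.
tasks_missing_from_ci() {
    workflow="$1"
    shift
    for task in "$@"; do
        if ! grep -q -F -e "$task" "$workflow"; then
            echo "$task"
        fi
    done
}
```

A call might look like `tasks_missing_from_ci .github/workflows/ci.yml lint typecheck test`, where the workflow path is an assumption about a GitHub Actions layout. Any name it prints is a task people are told to run that CI may never check.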

Closing

Local development, CI, and AI agents usually fail for the same boring reason: the repo never made its intended execution path explicit enough.

That gets more expensive as teams depend more on automation and AI-assisted development.

A ready repo should tell humans what to run, tell CI what to enforce, and tell agents what is safe.

That is alignment.

Not identical environments.

A shared path.