Pressure-testing Ota on Osiris: making runtime proof and Docker paths honest
How Osiris pressure-testing hardened Ota’s detached runtime semantics, runtime proof behavior, and contract modeling for documented Docker paths.
Overview
Osiris was a useful pressure repo because it looks like a lot of real application repos: a Next.js app, a documented Docker Compose self-host path, no canonical npm test, and a lint surface that exists but is not currently clean enough to claim as the default verification path.
That combination is exactly where readiness tools get exposed. It is easy to make a contract that parses. It is harder to make one that tells the truth.
Before Ota
Before the contract was tightened, the repo had the usual problems:
- install, dev, build, start, and Docker flows existed, but the readiness truth was split across README.md, DOCKER.md, Dockerfile, docker-compose.yml, and package scripts
- there was no honest way to claim a default test task, because the repo does not define one
- lint was visible, but not a truthful default verification gate because the current upstream baseline is red
- Docker Compose was documented, but that did not automatically mean the Docker engine should be modeled as a repo service
That last point mattered. A lot of tooling gets this wrong and starts inventing fake service models instead of separating host capability from repo runtime.
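To make the distinction concrete, here is a minimal sketch of what a host capability check could look like, as opposed to a service model. It is not Ota's implementation: the PreconditionResult type and the check_docker_engine function are hypothetical names chosen for illustration. The only real behavior it relies on is that the docker CLI's info command exits non-zero when it cannot reach a daemon.

```python
import shutil
import subprocess
from dataclasses import dataclass


@dataclass
class PreconditionResult:
    """Hypothetical result type for a host precondition check."""
    name: str
    ok: bool
    detail: str


def check_docker_engine(timeout_s: float = 10.0) -> PreconditionResult:
    """Ask whether this host can reach a Docker engine.

    This checks a host capability. It does not start, stop, or own
    anything, which is what separates it from a repo service.
    """
    if shutil.which("docker") is None:
        return PreconditionResult("docker-engine", False, "docker CLI not found on PATH")
    try:
        # `docker info` exits non-zero when the CLI cannot reach a daemon.
        proc = subprocess.run(
            ["docker", "info"],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError) as exc:
        return PreconditionResult("docker-engine", False, f"docker info did not complete: {exc}")
    if proc.returncode != 0:
        return PreconditionResult("docker-engine", False, "docker daemon not reachable")
    return PreconditionResult("docker-engine", True, "docker daemon reachable")
```

The check only reports on the host. Nothing in it has a lifecycle, a port, or a readiness endpoint, so there is nothing to model as a service.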
What we modeled
The final contract does not pretend the repo is cleaner than it is. It models the repo that actually exists today:
- a host-native Node path for install, typecheck, build, dev, and start
- an Ota-managed container path for install, verify, dev, and start
- a detached Docker Compose self-host path that matches the repo docs
- a bounded verify task that proves typecheck + build, not a fabricated test
- an explicit empty agent-safe surface, because this slice depends on dependency hydration and runtime-state mutation
The contract also moved Node ownership to toolchains.node, while keeping npm as a standalone tool. That is the honest shape for Osiris: the repo clearly needs Node 22, but it does not declare a first-class Corepack-managed package-manager lane in repo metadata.
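The bounded verify idea from the list above can be sketched in a few lines. This is an illustration under assumptions, not the contract itself: the post names the typecheck and build tasks but not the exact commands behind them, so the npm script names below are placeholders, and the verify and run_command functions are hypothetical.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

# Assumed commands: placeholders for whatever the repo's typecheck and
# build tasks actually run. There is deliberately no "test" step here.
VERIFY_STEPS: Tuple[Tuple[str, List[str]], ...] = (
    ("typecheck", ["npm", "run", "typecheck"]),
    ("build", ["npm", "run", "build"]),
)


def run_command(argv: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(argv)).returncode


def verify(
    steps: Sequence[Tuple[str, List[str]]] = VERIFY_STEPS,
    run: Callable[[Sequence[str]], int] = run_command,
) -> Tuple[bool, List[str]]:
    """Run each step in order and stop at the first failure.

    Returns (passed, names_of_steps_that_completed).
    """
    completed: List[str] = []
    for name, argv in steps:
        if run(argv) != 0:
            return False, completed
        completed.append(name)
    return True, completed
```

The point of the shape is what it leaves out: the task claims exactly the steps it runs, and a passing result never implies a test suite that does not exist.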
What Osiris exposed in Ota
Osiris surfaced two real Ota product gaps.
1. Detached native external-state starters were misclassified
The repo’s documented Docker path is:
docker compose up -d
That is a detached starter. The process exits cleanly while the runtime continues in Docker.
Older Ota behavior treated that like a failed service run if the endpoint was not yet reachable before the starter exited. That is the wrong execution model. A detached external-state starter is not the same thing as a foreground process that crashed.
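The difference between the two execution models can be sketched as a small classifier. This is a simplified illustration, not Ota's code: Starter, Outcome, and classify_start are hypothetical names, and a real implementation would also watch foreground process liveness while polling.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Outcome(Enum):
    READY = "ready"
    FAILED = "failed"
    NOT_READY = "not_ready"


@dataclass(frozen=True)
class Starter:
    # True for starters such as `docker compose up -d`, which hand the
    # runtime to an external system and then exit.
    detached: bool


def classify_start(
    starter: Starter,
    exit_code: Optional[int],  # None means the starter is still running
    probe: Callable[[], bool],  # returns True once the endpoint is reachable
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Outcome:
    if exit_code is not None and exit_code != 0:
        # A non-zero exit is a failure in either model.
        return Outcome.FAILED
    if exit_code == 0 and not starter.detached:
        # A foreground service process that exited took the runtime with it.
        return Outcome.FAILED
    # Detached starter that exited cleanly, or foreground process still
    # running: keep probing readiness until the deadline.
    deadline = clock() + timeout_s
    while True:
        if probe():
            return Outcome.READY
        if clock() >= deadline:
            return Outcome.NOT_READY
        sleep(interval_s)
```

In this sketch, a clean exit from a detached starter only means "start requested". Readiness is decided by the probe afterwards, not by whether the endpoint happened to be up at the moment the starter exited.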
2. Runtime proof treated warning-only doctor reports as failure
Once the Docker runtime was actually up, Ota could still fail runtime proof because doctor emitted warning-only findings around declared external-state mutation.
That is also the wrong trust model. Warning-only doctor output should still allow runtime proof to succeed when readiness is genuinely proven.
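As a rough sketch of that trust model, runtime proof can be treated as a function of two inputs: whether readiness was proven, and whether any doctor finding is a blocker. The Severity and Finding types and the runtime_proof_passes function are hypothetical, and the two-level severity scale is an assumption for illustration, not Ota's actual report format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class Severity(Enum):
    WARNING = "warning"
    BLOCKER = "blocker"


@dataclass(frozen=True)
class Finding:
    severity: Severity
    message: str


def runtime_proof_passes(readiness_proven: bool, findings: Iterable[Finding]) -> bool:
    """Pass only when readiness is proven and no finding is a blocker.

    Warnings are still reported to the user; they just do not veto the proof.
    """
    if not readiness_proven:
        return False
    return not any(f.severity is Severity.BLOCKER for f in findings)
```

For example, a report containing only a warning about declared external-state mutation would pass once readiness is proven, while the same report plus one blocker would not.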
What changed
Osiris drove platform fixes, not repo-local glue.
- detached native starters that explicitly mutate external state are no longer misclassified as failed service runs just because the starter process exits before readiness is observed
- runtime proof no longer fails when readiness is proven and the remaining doctor findings are warnings rather than blockers
The Osiris contract also got materially stronger:
- toolchains.node now owns the Node requirement
- verify is explicit and honest: typecheck + build
- the nonexistent test surface is not invented
- Docker Compose stays modeled as the documented runtime path, not a fake attached dev flow
- Docker engine availability is expressed as a host precondition check, not as a fake repo service
- protected paths and agent notes are tighter and clearer
That produces a better contract for both humans and agents. A contributor can see the canonical paths quickly. An agent can see what is safe, what is bounded, and what still depends on external state.
What the matrix now proves
The final matrix does more than check that ota.yaml parses.
It now covers:
- ota validate
- ota doctor as a diagnostic surface
- ota tasks --use
- ota tasks --safe --use
- workflow dry-run coverage
- task dry-run coverage
- representative native execution
- representative container execution
- optional runtime-proof lanes for the heavier runtime surfaces
That is the right level of proof for this repo. It keeps CI meaningful without turning every PR into a slow full-runtime gauntlet.
Why this matters
Osiris was a good reminder that repo readiness is not just about command discovery.
The hard part is telling the truth about execution:
- what the real verification surface is
- whether Docker is a repo service or a host capability
- whether a detached starter succeeded or failed
- whether runtime proof is actually proving readiness or just reacting to warning noise
Those are not documentation problems. They are infrastructure trust problems.
Pressure-testing Osiris made Ota better in exactly the places that matter most: execution semantics, runtime proof, and honest contract boundaries.
After this work, a new contributor or agent no longer has to infer which Osiris path is canonical from scattered docs and commands. They can inspect the declared surface with ota tasks --use, validate the contract with ota validate, diagnose readiness with ota doctor, and choose the host-native, container, or documented Docker self-host path on purpose.
Links:
- Contract: osiris ota.yaml
- Matrix workflow: test-ota-contract-matrix.yml
- Earlier green matrix run: #26851755987
- Current green matrix run: #26874635989