Engineering note · 2026-06-03 09:30 UTC

Pressure-testing Ota on Osiris: making runtime proof and Docker paths honest

How Osiris pressure-testing hardened Ota’s detached runtime semantics, runtime proof behavior, and contract modeling for documented Docker paths.

Overview

Osiris was a useful pressure repo because it looks like a lot of real application repos: a Next.js app, a documented Docker Compose self-host path, no canonical npm test, and a lint surface that exists but is not currently clean enough to claim as the default verification path.

That combination is exactly where readiness tools get exposed. It is easy to make a contract that parses. It is harder to make one that tells the truth.

Before Ota

Before the contract was tightened, the repo had the usual problems:

  • install, dev, build, start, and Docker flows existed, but the readiness truth was split across README.md, DOCKER.md, Dockerfile, docker-compose.yml, and package scripts
  • there was no honest way to claim a default test task, because the repo does not define one
  • lint was visible, but it could not serve as a truthful default verification gate because the upstream baseline currently fails
  • Docker Compose was documented, but that did not automatically mean the Docker engine should be modeled as a repo service

That last point mattered. A lot of tooling gets this wrong and starts inventing fake service models instead of separating host capability from repo runtime.

What we modeled

The final contract does not pretend the repo is cleaner than it is. It models the repo that actually exists today:

  • a host-native Node path for install, typecheck, build, dev, and start
  • an Ota-managed container path for install, verify, dev, and start
  • a detached Docker Compose self-host path that matches the repo docs
  • a bounded verify task that proves typecheck + build, not a fabricated test
  • an explicitly empty agent-safe surface, because this slice depends on dependency hydration and runtime-state mutation
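The bounded verify task in the list above can be pictured as a short, ordered step runner. This is a minimal sketch, not Ota's implementation: the function name `run_verify` is hypothetical, and the npm script names in the usage comment are assumptions about how Osiris exposes typecheck and build.

```bash
# Hypothetical sketch: a bounded verify that runs only the steps it is given,
# in order, and stops at the first failure. No test step is invented.

# run_verify <step_cmd>...
run_verify() {
  local step
  for step in "$@"; do
    echo "verify: $step"
    # Each step is a shell command string; a non-zero exit ends the run.
    sh -c "$step" || { echo "verify: failed at: $step" >&2; return 1; }
  done
  echo "verify: ok"
}

# Assumed usage (script names are illustrative, not taken from the repo):
#   run_verify "npm run typecheck" "npm run build"
```

Keeping the step list explicit makes the claim auditable: the task proves exactly what it lists and nothing more.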

The contract also moved Node ownership to toolchains.node, while keeping npm as a standalone tool. That is the honest shape for Osiris: the repo clearly needs Node 22, but it does not declare a first-class Corepack-managed package-manager lane in repo metadata.

What Osiris exposed in Ota

Osiris surfaced two real Ota product gaps.

1. Detached native external-state starters were misclassified

The repo’s documented Docker path is:

```bash
docker compose up -d
```

That is a detached starter. The process exits cleanly while the runtime continues in Docker.

Older Ota behavior treated that as a failed service run if the endpoint was not yet reachable when the starter exited. That is the wrong execution model. A detached external-state starter is not the same thing as a foreground process that crashed.
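One way to picture the corrected model, as a minimal sketch rather than Ota's actual implementation: the starter's exit status and the runtime's readiness are judged separately. A clean exit only means the starter did its job, and readiness is then polled within a budget. The function name `classify_detached_start`, the state labels, and the probe argument are all hypothetical.

```bash
# Hypothetical sketch: classify a detached start from the starter's exit
# status plus a separate readiness poll. None of these names come from Ota.

# classify_detached_start <attempts> <delay_seconds> <starter_cmd> <probe_cmd>
# Prints one of: starter-failed | ready | not-ready
classify_detached_start() {
  local attempts="$1" delay="$2" starter="$3" probe="$4"

  # A detached starter is expected to exit; only a non-zero exit is a failure.
  if ! sh -c "$starter" >/dev/null 2>&1; then
    echo "starter-failed"
    return 1
  fi

  # A clean exit says nothing about readiness yet, so poll for it.
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    if sh -c "$probe" >/dev/null 2>&1; then
      echo "ready"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done

  # The starter succeeded, but readiness was never observed within the budget.
  echo "not-ready"
  return 2
}
```

For the documented Docker path, a call might look like `classify_detached_start 30 2 'docker compose up -d' 'curl -fsS http://localhost:3000/'`, where the URL and the retry budget are assumptions, not values taken from the Osiris docs.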

2. Runtime proof treated warning-only doctor reports as failure

Once the Docker runtime was actually up, Ota could still fail runtime proof because doctor emitted warning-only findings around declared external-state mutation.

That is also the wrong trust model. Warning-only doctor output should still allow runtime proof to succeed when readiness is genuinely proven.
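The trust rule described here can be sketched as a small decision function: readiness must be proven, and only blocker-level findings veto the proof. This is an illustration under stated assumptions. The function name `runtime_proof_outcome` and the severity labels `warning` and `blocker` are hypothetical and do not reflect Ota's real doctor output format.

```bash
# Hypothetical sketch: decide a runtime-proof outcome from a readiness flag
# and a list of doctor finding severities. Labels are assumptions.

# runtime_proof_outcome <ready:yes|no> [severity...]
# Prints "pass" or "fail".
runtime_proof_outcome() {
  local ready="$1"
  shift

  # Without proven readiness there is nothing to certify.
  if [ "$ready" != "yes" ]; then
    echo "fail"
    return 1
  fi

  # Only blocker-level findings veto the proof; warnings stay visible
  # in the report but do not change the outcome.
  local severity
  for severity in "$@"; do
    if [ "$severity" = "blocker" ]; then
      echo "fail"
      return 1
    fi
  done

  echo "pass"
}
```

Under this rule, `runtime_proof_outcome yes warning warning` passes, while a single `blocker` finding or an unproven readiness flag fails.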

What changed

Osiris drove platform fixes, not repo-local glue.

  • detached native starters that explicitly mutate external state are no longer misclassified as failed service runs just because the starter process exits before readiness is observed
  • runtime proof no longer fails when readiness is proven and the remaining doctor findings are warnings rather than blockers

The Osiris contract also got materially stronger:

  • toolchains.node now owns the Node requirement
  • verify is explicit and honest: typecheck + build
  • the nonexistent test surface is not invented
  • Docker Compose stays modeled as the documented runtime path, not a fake attached dev flow
  • Docker engine availability is expressed as a host precondition check, not as a fake repo service
  • protected paths and agent notes are tighter and clearer
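The host precondition idea from the list above can be sketched as a check that asks whether the Docker engine is usable, without trying to start or own it. This is a minimal sketch, not how Ota expresses the check in its contract; the function name `docker_engine_available` and its return codes are hypothetical.

```bash
# Hypothetical sketch: treat the Docker engine as a host capability to check,
# not as a service the repo starts. The function name is illustrative.

# docker_engine_available [docker_binary]
# Returns 0 when the CLI exists and the daemon answers,
# 1 when the CLI is missing, 2 when the daemon is unreachable.
docker_engine_available() {
  local docker_bin="${1:-docker}"

  # The CLI must be on PATH before anything else can be checked.
  command -v "$docker_bin" >/dev/null 2>&1 || return 1

  # `docker info` exits non-zero when the client cannot reach a daemon.
  "$docker_bin" info >/dev/null 2>&1 || return 2
}
```

A caller would gate the Docker Compose path on this check, for example `docker_engine_available && docker compose up -d`, and report a missing engine as an unmet host requirement rather than as a failed repo service.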

That produces a better contract for both humans and agents. A contributor can see the canonical paths quickly. An agent can see what is safe, what is bounded, and what still depends on external state.

What the matrix now proves

The final matrix does more than check that ota.yaml parses.

It now covers:

  • ota validate
  • ota doctor as a diagnostic surface
  • ota tasks --use
  • ota tasks --safe --use
  • workflow dry-run coverage
  • task dry-run coverage
  • representative native execution
  • representative container execution
  • optional runtime-proof lanes for the heavier runtime surfaces

That is the right level of proof for this repo. It keeps CI meaningful without turning every PR into a slow full-runtime gauntlet.

Why this matters

Osiris was a good reminder that repo readiness is not just about command discovery.

The hard part is telling the truth about execution:

  • what the real verification surface is
  • whether Docker is a repo service or a host capability
  • whether a detached starter succeeded or failed
  • whether runtime proof is actually proving readiness or just reacting to warning noise

Those are not documentation problems. They are infrastructure trust problems.

Pressure-testing Osiris made Ota better in exactly the places that matter most: execution semantics, runtime proof, and honest contract boundaries.

After this work, a new contributor or agent no longer has to infer which Osiris path is canonical from scattered docs and commands. They can inspect the declared surface with ota tasks --use, validate the contract with ota validate, diagnose readiness with ota doctor, and choose the host-native, container, or documented Docker self-host path on purpose.
