Why readiness has layers

Readiness is not one boolean in real repos.

A repo can have dependencies installed, a process running, a port open, an app health route passing, and still not be ready for the workflow a developer actually asked for.

Ota keeps those truths separate so ota doctor, ota check, ota up, CI, and agents can explain what was proven instead of treating every signal as the same thing.

Process-started is not the same as listener-reachable. A process can be alive and still unusable.
Listener-reachable is not the same as app-ready. A socket can open before the workload is healthy.
App-ready is not the same as host endpoint confirmed. Internal success does not guarantee external reachability.
Repo-ready and workflow-ready are different scopes. A repo can be good for one path and broken for another.
Workflow-ready means the selected operational path has its own setup, services, runtime, and checks all satisfied.

The model

Use the narrowest truthful readiness owner for each signal.

Do not duplicate the same health truth across tasks, services, checks, and workflows unless a current command needs a bridge.

The model

Task runtime readiness

A long-running task proves its own workload is usable.

When to use it

Use this for app servers, dev servers, workers, or any task with a declared runtime and listener.

The model

Runtime surfaces

A reusable runtime endpoint definition keeps listener, readiness, and workflow exposure truth attached to one named app surface.

When to use it

Use this when the same backend, frontend, API, or UI endpoint appears across multiple tasks or workflows.

The model

Service readiness

A repo dependency proves the infrastructure it owns is reachable from the context that matters.

When to use it

Use this for databases, queues, Compose services, or workspace-produced services declared under services.

The model

Checks

A named command-level readiness gate runs without becoming part of a business task.

When to use it

Use this for CI, ota check, workflow gates, or repo-specific probes that do not belong to one runtime.

The model

Workflow readiness

A named operational path chooses which setup, services, run task, and checks define ready for that path.

When to use it

Use this when a repo has app, backend, frontend, worker, or CI paths that should not all share one vague repo-ready state.

Command behavior

ota tasks --workflow <name> is a read-only discovery command; it shows selected workflow pieces without executing them.
ota doctor diagnoses the selected repo or workflow and emits blockers, warnings, and next actions in order.
ota check runs configured checks without task execution, so it is the cheapest readiness gate.
ota up prepares the selected workflow by starting services, running setup when needed, activating runtime tasks, and re-checking readiness.
ota execution plan renders the execution boundary before mutation, so you can inspect effects safely.
Workspace commands honor repos.<name>.workflow when the workspace pins one path per repo.
JSON output should expose the effective workflow and task whenever workflow-aware readiness is selected.
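
As a rough illustration of the workspace pin mentioned above, the fragment below pins one workflow per repo. Only the repos.<name>.workflow path comes from this page; the repo names, the workflow names, and the file this fragment would live in are assumptions.

```yaml
# Hypothetical workspace fragment. Only the repos.<name>.workflow path is
# taken from this page; repo and workflow names are illustrative.
repos:
  api:
    workflow: backend
  web:
    workflow: frontend
```

With a pin like this, workspace commands would evaluate api against its backend workflow rather than a vague repo-wide default.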

Runtime readiness

Runtime readiness belongs on the task when the thing being proved is the workload started by that task.

This keeps app health close to the command and listener that own it.

Use tcp when accepting connections is the only readiness fact you need.
Use http when the app has a direct health route that proves true readiness.
Bind readiness to the listener that represents the actual workload surface.
Keep business endpoints out of readiness unless a business route is the real health boundary.

Task runtime readiness

```yaml
tasks:
  dev:
    context: app
    run: next dev
    runtime:
      kind: service
      readiness:
        kind: http
        listener: http
        path: /
      listeners:
        http:
          protocol: http
          bind:
            address: 0.0.0.0
            port:
              mode: fixed
              value: 3000
          project:
            host:
              address: 127.0.0.1
              port:
                mode: fixed
                value: 3000
              path: /
```

Service readiness

Service readiness belongs under services when the repo dependency is not the same thing as the app task.

Use this for managed local infrastructure or producer-owned workspace services instead of hiding dependency checks inside setup scripts.

Use services.<name>.readiness when infrastructure readiness is owned by a service declaration.
Use from to specify which execution context owns the endpoint truth being tested.
Use structured tcp/http when the contract can express the probe directly.
Keep legacy run readiness only as a compatibility fallback when the current probe truth cannot yet be expressed structurally.

Structured service readiness

```yaml
services:
  postgres:
    required: true
    manager:
      kind: compose
      name: local
      file: compose.yaml
      service: postgres
    endpoints:
      app:
        address: postgres
        port: 5432
    readiness:
      from: app
      kind: tcp
      timeout: 3s
      retries: 5
```

Checks

Checks are the named readiness gates that ota check, ota doctor, workflows, and CI can reuse today.

Use checks when the readiness signal should be explicit and machine-repeatable, but is not owned by one runtime or service declaration.

Choose checks[].name as a stable key that is safe for CI history and alerting.
Use kind: precondition for required repo or host facts before any runtime execution.
Use kind: health for app or runtime validation checks.
Use kind: file or kind: env when ota already ships the deterministic assertion surface; reserve shell run checks for logic that still does not fit the structured lanes.
Keep file checks repo-scoped by default; add scope: workspace only when the contract truth really depends on a sibling relative input outside the repo subtree.
checks[].kind: env is the first-class dotenv-backed assertion surface for repo-relative env files and resolved env values. Use it for state: present|missing, exact equals, not_equals, host rules such as host.allowed, URL/DSN host rules such as url_host.allowed, or policy: not_loopback instead of shell grep / findstr glue.
Use severity to define whether a failure is blocking (error) or advisory (warn/info).
Use run as the portable escape hatch when logic cannot be modeled as a built-in probe.

Command-backed checks

```yaml
checks:
  - name: repo-config-valid
    kind: precondition
    severity: error
    run: ./scripts/validate-config.sh
    timeout: 10
  - name: backend-ready
    kind: health
    severity: error
    run: node -e "fetch('http://127.0.0.1:5678/healthz/readiness').then((res)=>process.exit(res.ok?0:1)).catch(()=>process.exit(1))"
    timeout: 10000
```
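
The severity levels described above can be sketched with an advisory gate next to a blocking one. The keys name, kind, severity, run, and timeout are the ones used in the command-backed example; the check names and script paths are hypothetical.

```yaml
checks:
  # Blocking: a failure here is reported as an error.
  - name: node-version-pinned           # hypothetical check name
    kind: precondition
    severity: error
    run: ./scripts/check-node-version.sh   # hypothetical script
    timeout: 10
  # Advisory: a failure is surfaced as a warning rather than a blocker.
  - name: seed-data-present             # hypothetical check name
    kind: health
    severity: warn
    run: ./scripts/check-seed-data.sh      # hypothetical script
    timeout: 10
```

Keeping the names stable matters more than the script contents, since checks[].name is the key that CI history and alerting would track.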

Workflow readiness

Workflow readiness answers the question "ready for what?"

Use it when the repo has more than one valid front door and plain repo-wide readiness would either be too broad or too vague.

Use workflows.default when one workflow should be the baseline for ota doctor, ota check, ota up, and ota execution plan.
Use --workflow <name> to scope diagnosis and preparation to a specific operational path.
Use readiness.checks to bind exactly the check set for that workflow.
Use services.required to scope the service graph for that workflow path only.
A backend workflow should not fail because a frontend-only readiness check is broken.

Workflow-scoped readiness

```yaml
workflows:
  default: app
  app:
    setup:
      task: setup
    run:
      task: dev
    services:
      required:
        - postgres
    readiness:
      checks:
        - backend-ready
    exposes:
      - http://127.0.0.1:5678
```
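
To illustrate how workflow scoping keeps unrelated failures apart, here is a sketch with two front doors. It reuses only keys shown in the workflow example; the frontend workflow, the web task, and the frontend-ready check are hypothetical names that would need matching tasks and checks entries, and omitting services on the frontend workflow assumes that block is optional.

```yaml
workflows:
  default: backend
  backend:
    setup:
      task: setup
    run:
      task: dev
    services:
      required:
        - postgres          # only the backend path needs the database
    readiness:
      checks:
        - backend-ready
  frontend:
    setup:
      task: setup
    run:
      task: web             # hypothetical task name
    readiness:
      checks:
        - frontend-ready    # hypothetical check; not bound to the backend path
```

Selecting --workflow backend would then bind only backend-ready and postgres, so a broken frontend-ready check would not fail the backend path.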

TCP versus HTTP readiness

Use tcp when a listener that accepts connections is all that readiness needs to prove.
Use http when one dedicated health route proves readiness better than a raw port.
HTTP readiness supports method, request headers, success.status, and body.contains, plus timing controls like interval, timeout, retries, and start_period.
Pick the smallest truthful probe, not the easiest to write.

HTTP task readiness

```yaml
runtime:
  kind: service
  readiness:
    kind: http
    listener: http
    method: GET
    path: /health
    headers:
      Accept: application/json
    success:
      status: [200]
    body:
      contains: '"status":"UP"'
    interval: 5s
    timeout: 3s
    retries: 5
    start_period: 10s
```

Listener-only TCP readiness

```yaml
runtime:
  kind: service
  readiness:
    kind: tcp
    listener: http
```

How to choose the readiness owner

Use tasks.<name>.runtime.readiness when the task owns the app process.
Use services.<name>.readiness when a service owns the dependency.
Use checks for cross-cutting or CI-only validation that should run without starting a task.
Use workflows.<name>.readiness for path-based readiness decisions, not shell-script branching.
Use tcp when listener reachability is sufficient.
Use http when one health route is a stronger readiness signal than raw socket state.
Use body.contains only when status code is not enough to prove truth.
Use start_period, interval, timeout, and retries to define bounded, predictable waiting behavior.

Projected versus confirmed endpoints

A projected endpoint is the URL ota resolves from the contract. A confirmed endpoint is one ota itself has actually reached.

project.host defines the declared host-facing URL shape.
--host-port <port> overrides the projected host port for that one run without changing the internal bind port.
Framework-local logs such as Local: http://localhost:3000 are not treated as host confirmation.
Stream-mode External: and Internal: lines only print after ota confirms the projected endpoint itself.

Why internal readiness can still differ from host confirmation

A container can answer on its internal listener while host publication is still broken or unreachable.
A remote or VM-backed Docker context can publish ports inside the VM while macOS localhost still cannot reach them.
That is why ota separates projected endpoint truth from confirmed endpoint truth.

VM-backed container boundaries

One real-world example is Colima on macOS: Docker can report a published host port while the port is only reachable inside the container VM.

In that case, the app may be ready internally and even reachable inside the VM.
macOS 127.0.0.1:<port> can still fail.
Ota withholds the confirmation banner because host reachability was not actually proven.
The interrupted pre-confirmation path now calls this out explicitly when Docker is running through the colima context.

Verify from inside Colima

```bash
colima ssh -- curl http://127.0.0.1:3001/
```

How to author readiness well

Declare readiness once at the layer that owns the truth.
Bind runtime readiness to the listener that actually represents the service surface you care about.
Bind service readiness to the context view that must consume that service.
Use checks for named gates that CI, workflow readiness, or humans need to run directly.
Scope multi-front-door repos with workflows instead of making every check global.
Treat omitted readiness as a conscious downgrade to listener-reachability or command-success semantics.
Debug endpoint problems by separating internal app truth from host-forwarding truth first.

Reusable probes

Reusable readiness probes are the structured way to declare a transport-level readiness target once.

Choose reusable probes when one transport-level readiness target is needed in checks, workflow readiness, and/or task readiness.

Decision rule:

Use literal url probes when the endpoint is external, third-party, or cannot be modeled by Ota topology.

Use topology-derived probes when the endpoint is owned by a declared Ota task listener or service endpoint.

Declare reusable probes under readiness.probes.
Use url when the endpoint is external, third-party, or intentionally outside the repo contract.
Use topology-derived targets once the endpoint is declared by an Ota task listener or service endpoint.
Task-target probes default to the invoking ota command plane, where target.address_view: host is the right published-endpoint shape.
Use target.observer.kind: task plus target.observer.task when the probe should resolve host, topology, or internal exactly as one named task sees it from its own backend plane.
Service-target probes can resolve http or tcp readiness directly from one declared service endpoint.
Use target.observer only on task-target probes; service-target probes reject it.
HTTP probes now use the same request-shaping surface as runtime/service readiness: method, headers, success.status, and body.contains.
Omit both expect_status and success.status for the normal default 200, use expect_status when one shorthand status is clearer, and use success.status when you want multiple accepted statuses.
Use checks[].probe when the probe should appear as a named check gate.
Use workflows.<name>.readiness.probes when the selected workflow should prove that probe directly.
Use tasks.<name>.runtime.readiness.probe or services.<name>.readiness.probe when runtime or service-manager readiness should reuse the same transport and timeout contract without duplicating it inline.
Keep checks[].run as the portable escape hatch for checks that cannot be expressed as a built-in probe.
Avoid duplicating the same HTTP readiness URL across inline shell checks, workflow readiness, task runtime readiness, or service readiness when one reusable probe can own it once.
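
As a sketch of the reuse described in the list above, one probe can be declared once and referenced from both a named check and a task runtime. The readiness.probes, checks[].probe, and tasks.<name>.runtime.readiness.probe paths come from the list; whether runtime.readiness needs further sibling keys next to probe is an assumption, and the backend task and its run command are hypothetical.

```yaml
readiness:
  probes:
    backend-ready:
      kind: http
      url: http://127.0.0.1:5678/healthz/readiness
      timeout: 10000

checks:
  - name: backend-ready
    kind: health
    severity: error
    probe: backend-ready        # named gate that reuses the probe

tasks:
  backend:
    run: npm run start          # hypothetical command
    runtime:
      kind: service
      readiness:
        probe: backend-ready    # assumed shape for tasks.<name>.runtime.readiness.probe
```

The point of the shape is that the URL and timeout appear once, so the check and the runtime cannot drift apart.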

Literal URL probe for external or third-party endpoints

Use a literal url when the endpoint is not owned by an Ota task listener or service endpoint.

This keeps external services, third-party microservices, gateways, and quick-start probes easy to model without forcing fake topology into the contract.

ota.yaml

```yaml
readiness:
  probes:
    backend-ready:
      kind: http
      url: http://127.0.0.1:5678/healthz/readiness
      method: GET
      headers:
        x-ota-probe: external
      success:
        status: [200]
      timeout: 10000
checks:
  - name: backend-ready
    kind: health
    severity: error
    probe: backend-ready
workflows:
  backend:
    readiness:
      probes:
        - backend-ready
```
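
The three ways of stating accepted statuses that this page describes can be compared side by side. This is only a sketch: expect_status and success.status are the keys named here, while the probe names, URLs, and specific status codes are illustrative assumptions.

```yaml
readiness:
  probes:
    # Neither expect_status nor success.status: the default accepted status is 200.
    gateway-default:
      kind: http
      url: http://127.0.0.1:8080/healthz   # hypothetical endpoint
    # Shorthand for a single accepted status.
    gateway-no-content:
      kind: http
      url: http://127.0.0.1:8080/ping      # hypothetical endpoint
      expect_status: 204
    # List form when several statuses count as ready.
    gateway-multi:
      kind: http
      url: http://127.0.0.1:8080/ready     # hypothetical endpoint
      success:
        status: [200, 204]
```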

Topology-derived probe for Ota-owned endpoints

Use a topology-derived target when the endpoint already belongs to a declared task listener or service endpoint.

That lets readiness follow the canonical host, port, listener, and endpoint truth instead of duplicating a URL that can drift.

Without an observer, task-target probes resolve from the invoking command plane and execution-topology JSON reports that explicitly as target.resolution_plane: command_host.

When one caller-relative view matters, declare target.observer.kind: task plus target.observer.task so the probe resolves exactly as that task sees the producer from its effective backend plane.

ota.yaml

```yaml
readiness:
  probes:
    backend-ready:
      kind: http
      target:
        kind: task
        name: backend
        listener: backend
        address_view: topology
        observer:
          kind: task
          task: sandbox
      method: GET
      path: /healthz/readiness
      success:
        status: [200]
      timeout: 10000
    postgres-ready:
      kind: tcp
      target:
        kind: service
        name: postgres
        endpoint: app
      timeout: 10000
```
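
For comparison, a task-target probe without an observer might look like the sketch below. It uses only keys that appear on this page, with address_view: host as the published-endpoint shape for the invoking command plane; the probe name is hypothetical.

```yaml
readiness:
  probes:
    backend-host-ready:         # hypothetical probe name
      kind: http
      target:
        kind: task
        name: backend
        listener: backend
        address_view: host      # published endpoint as seen by the invoking ota command
        # No observer here, so resolution stays on the command plane
        # (reported as target.resolution_plane: command_host).
      method: GET
      path: /healthz/readiness
      success:
        status: [200]
      timeout: 10000
```

Adding target.observer would be the step to take only when one specific task's view of the backend is the truth being tested.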