Ota Readiness Model
How ota decides whether a repo is prepared, a service is reachable, an app is ready, and a workflow is operational.
Why readiness has layers
Readiness is not one boolean in real repos.
A repo can have dependencies installed, a process running, a port open, an app health route passing, and still not be ready for the workflow a developer actually asked for.
Ota keeps those truths separate so ota doctor, ota check, ota up, CI, and agents can explain what was proven instead of treating every signal as the same thing.
- Process-started is not the same as listener-reachable. A process can be alive and still unusable.
- Listener-reachable is not the same as app-ready. A socket can open before the workload is healthy.
- App-ready is not the same as host endpoint confirmed. Internal success does not guarantee external reachability.
- Repo-ready and workflow-ready are different scopes. A repo can be good for one path and broken for another.
- Workflow-ready means the selected operational path has its own setup, services, runtime, and checks all satisfied.
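The gap between listener-reachable and app-ready in the list above can be reproduced with a small stand-alone sketch. It is not ota code: `listener_reachable`, `app_ready`, the `WarmingUp` handler, and the `/health` path are hypothetical names chosen for illustration, using only the Python standard library.

```python
import http.client
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def listener_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """TCP-level fact: something accepts connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def app_ready(host: str, port: int, path: str = "/health", timeout: float = 1.0) -> bool:
    """HTTP-level fact: the (assumed) health route answers with status 200."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()


class WarmingUp(BaseHTTPRequestHandler):
    """Hypothetical app that listens immediately but is not healthy yet."""

    def do_GET(self):
        self.send_response(503)
        self.end_headers()

    def log_message(self, *args):  # keep the demo quiet
        pass


def demo() -> tuple[bool, bool]:
    """Return (listener_reachable, app_ready) for a server that is still warming up."""
    server = HTTPServer(("127.0.0.1", 0), WarmingUp)  # port 0 picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        return listener_reachable(host, port), app_ready(host, port)
    finally:
        server.shutdown()
        server.server_close()
```

Here `demo()` returns `(True, False)`: the socket accepts connections while the health route still reports 503, which is why the two facts are worth recording separately.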
The model
Use the narrowest truthful readiness owner for each signal.
Do not duplicate the same health truth across tasks, services, checks, and workflows unless a current command needs a bridge.
- Task runtime readiness. A long-running task proves its own workload is usable.
- Runtime surfaces. A reusable runtime endpoint definition keeps listener, readiness, and workflow exposure truth attached to one named app surface.
- Service readiness. A repo dependency proves the infrastructure it owns is reachable from the context that matters.
- Checks. A named command-level readiness gate runs without becoming part of a business task.
- Workflow readiness. A named operational path chooses which setup, services, run task, and checks define ready for that path.
Command behavior
- `ota tasks --workflow <name>` is a read-only discovery command; it shows selected workflow pieces without executing them.
- `ota doctor` diagnoses the selected repo or workflow and emits blockers, warnings, and next actions in order.
- `ota check` runs configured checks without task execution, so it is the cheapest readiness gate.
- `ota up` prepares the selected workflow by starting services, running setup when needed, activating runtime tasks, and re-checking readiness.
- `ota execution plan` renders the execution boundary before mutation, so you can inspect effects safely.
- Workspace commands honor `repos.<name>.workflow` when the workspace pins one path per repo.
- JSON output should expose the effective workflow and task whenever workflow-aware readiness is selected.
Runtime readiness
Runtime readiness belongs on the task when the thing being proved is the workload started by that task.
This keeps app health close to the command and listener that own it.
- Use `tcp` when accepting connections is the only readiness fact you need.
- Use `http` when the app has a direct health route that proves true readiness.
- Bind readiness to the listener that represents the actual workload surface.
- Keep business endpoints out of readiness unless a business route is the real health boundary.

    tasks:
      dev:
        context: app
        run: next dev
        runtime:
          kind: service
          readiness:
            kind: http
            listener: http
            path: /
          listeners:
            http:
              protocol: http
              bind:
                address: 0.0.0.0
                port:
                  mode: fixed
                  value: 3000
              project:
                host:
                  address: 127.0.0.1
                  port:
                    mode: fixed
                    value: 3000
                  path: /

Service readiness
Service readiness belongs under services when the repo dependency is not the same thing as the app task.
Use this for managed local infrastructure or producer-owned workspace services instead of hiding dependency checks inside setup scripts.
- Use `services.<name>.readiness` when infrastructure readiness is owned by a service declaration.
- Use `from` to specify which execution context owns the endpoint truth being tested.
- Use structured `tcp`/`http` when the contract can express the probe directly.
- Use legacy `run` readiness only when service behavior needs repo-specific shell commands.

    services:
      postgres:
        required: true
        manager:
          kind: compose
          name: local
          file: compose.yaml
          service: postgres
        endpoints:
          app:
            address: postgres
            port: 5432
        readiness:
          from: app
          kind: tcp
          timeout: 3s
          retries: 5

Checks
Checks are the named readiness gates that ota check, ota doctor, workflows, and CI can reuse today.
Use checks when the readiness signal should be explicit and machine-repeatable, but is not owned by one runtime or service declaration.
- Choose `checks[].name` as a stable key that is safe for CI history and alerting.
- Use `kind: precondition` for required repo or host facts before any runtime execution.
- Use `kind: health` for app or runtime validation checks.
- Use `kind: file` or `kind: env` when ota already ships the deterministic assertion surface; reserve shell `run` checks for logic that still does not fit the structured lanes.
- `checks[].kind: env` is the first-class dotenv-backed assertion surface for repo-relative env files and resolved env values. Use it for `state: present|missing`, exact `equals`, `not_equals`, host rules such as `host.allowed`, URL/DSN host rules such as `url_host.allowed`, or `policy: not_loopback` instead of shell `grep`/`findstr` glue.
- Use `severity` to define whether a failure is blocking (`error`) or advisory (`warn`/`info`).
- Use `run` as the portable escape hatch when logic cannot be modeled as a built-in probe.
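To make the severity split and the loopback policy in the list above concrete, here is a minimal sketch of what such assertions amount to. It does not show how ota implements them: `url_host_is_loopback`, `CheckResult`, and `gate` are hypothetical names, the check names in the usage note are invented, and hostnames are not resolved through DNS.

```python
import ipaddress
from dataclasses import dataclass
from urllib.parse import urlsplit


def url_host_is_loopback(url: str) -> bool:
    """True when the URL or DSN host is localhost or a loopback IP literal."""
    host = urlsplit(url).hostname or ""
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a non-IP hostname; this sketch does not resolve it


@dataclass
class CheckResult:
    name: str
    severity: str  # "error", "warn", or "info"
    passed: bool


def gate(results: list[CheckResult]) -> tuple[bool, list[str], list[str]]:
    """Split failed checks into blockers (error) and advisories (warn/info)."""
    blockers = [r.name for r in results if not r.passed and r.severity == "error"]
    advisories = [r.name for r in results if not r.passed and r.severity != "error"]
    return (not blockers), blockers, advisories
```

For example, a failing `error` check named `db-url-not-loopback` alongside a failing `warn` check named `docs-link` yields `(False, ["db-url-not-loopback"], ["docs-link"])`: only the first one blocks.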

    checks:
      - name: repo-config-valid
        kind: precondition
        severity: error
        run: ./scripts/validate-config.sh
        timeout: 10
      - name: backend-ready
        kind: health
        severity: error
        run: node -e "fetch('http://127.0.0.1:5678/healthz/readiness').then((res)=>process.exit(res.ok?0:1)).catch(()=>process.exit(1))"
        timeout: 10000

Workflow readiness
Workflow readiness answers the question "ready for what?".
Use it when the repo has more than one valid front door and plain repo-wide readiness would either be too broad or too vague.
- Use `workflows.default` when one workflow should be the baseline for `ota doctor`, `ota check`, `ota up`, and `ota execution plan`.
- Use `--workflow <name>` to scope diagnosis and preparation to a specific operational path.
- Use `readiness.checks` to bind exactly the check set for that workflow.
- Use `services.required` to scope the service graph for that workflow path only.
- A backend workflow should not fail because a frontend-only readiness check is broken.
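The scoping rule in the last bullet can be sketched as a small evaluation function. This is an illustration under assumptions, not ota's resolver: `workflow_ready` is a hypothetical helper, the workflow dictionaries mirror only the `readiness.checks` key described above, and treating a missing check result as a failure is a choice made for this sketch.

```python
def workflow_ready(workflow: dict, check_results: dict[str, bool]) -> tuple[bool, list[str]]:
    """Evaluate only the checks that the workflow binds under readiness.checks."""
    bound = workflow.get("readiness", {}).get("checks", [])
    # Assumption: a bound check with no recorded result counts as failing.
    failing = [name for name in bound if not check_results.get(name, False)]
    return (not failing), failing


# Hypothetical workflows and check outcomes for illustration.
workflows = {
    "backend": {"readiness": {"checks": ["backend-ready"]}},
    "frontend": {"readiness": {"checks": ["frontend-ready"]}},
}
results = {"backend-ready": True, "frontend-ready": False}

backend_state = workflow_ready(workflows["backend"], results)
frontend_state = workflow_ready(workflows["frontend"], results)
```

With these inputs the backend workflow is ready even though `frontend-ready` fails, because that check is not part of the backend path.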

    workflows:
      default: app
      app:
        setup:
          task: setup
        run:
          task: dev
        services:
          required:
            - postgres
        readiness:
          checks:
            - backend-ready
        exposes:
          - http://127.0.0.1:5678

TCP versus HTTP readiness
- Use `tcp` when readiness is correctly represented by listener acceptability alone.
- Use `http` when one dedicated health route proves readiness better than a raw port.
- HTTP readiness supports `method`, request `headers`, `success.status`, and `body.contains`, plus timing controls like `interval`, `timeout`, `retries`, and `start_period`.
- Pick the smallest truthful probe, not the easiest to write.

    runtime:
      kind: service
      readiness:
        kind: http
        listener: http
        method: GET
        path: /health
        headers:
          Accept: application/json
        success:
          status: [200]
        body:
          contains: '"status":"UP"'
        interval: 5s
        timeout: 3s
        retries: 5
        start_period: 10s

    runtime:
      kind: service
      readiness:
        kind: tcp
        listener: http

How to choose the readiness owner
- Use `tasks.<name>.runtime.readiness` when the task owns the app process.
- Use `services.<name>.readiness` when a service owns the dependency.
- Use `checks` for cross-cutting or CI-only validation that should run without starting a task.
- Use `workflows.<name>.readiness` for path-based readiness decisions, not shell-script branching.
- Use `tcp` when listener reachability is sufficient.
- Use `http` when one health route is a stronger readiness signal than raw socket state.
- Use `body.contains` only when status code is not enough to prove truth.
- Use `start_period`, `interval`, `timeout`, and `retries` to define bounded, predictable waiting behavior.
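The last bullet is easier to reason about with numbers. The sketch below shows one plausible reading of the four timing controls; the exact semantics ota applies are not stated here, so treat these as assumptions: `start_period` is modeled as an initial delay, `retries` as the total number of attempts, `timeout` as the per-attempt limit handed to the probe, and `interval` as the pause between attempts. `wait_until_ready` and `worst_case_wait` are hypothetical helpers.

```python
import time
from typing import Callable


def worst_case_wait(start_period: float, interval: float, timeout: float, retries: int) -> float:
    """Upper bound in seconds under the assumptions stated above."""
    return start_period + retries * timeout + max(retries - 1, 0) * interval


def wait_until_ready(
    probe: Callable[[float], bool],
    *,
    start_period: float,
    interval: float,
    timeout: float,
    retries: int,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[bool, int]:
    """Poll probe(timeout) until it succeeds or attempts run out.

    Returns (ready, attempts_used). The sleep function is injectable so the
    loop can be exercised without real waiting.
    """
    sleep(start_period)  # assumed: grace delay before the first attempt
    for attempt in range(1, retries + 1):
        if probe(timeout):
            return True, attempt
        if attempt < retries:
            sleep(interval)  # pause only between attempts, not after the last
    return False, retries
```

Under this reading, `start_period: 10s`, `interval: 5s`, `timeout: 3s`, and `retries: 5` bound the wait at 10 + 5 × 3 + 4 × 5 = 45 seconds, which is the kind of predictable ceiling the bullet asks for.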
Projected versus confirmed endpoints
A projected endpoint is the URL ota resolves from the contract. A confirmed endpoint is one ota itself has actually reached.
- `project.host` defines the declared host-facing URL shape
- `--host-port <port>` overrides that one run's projected host port without changing the internal bind port
- framework-local logs such as `Local: http://localhost:3000` are not treated as host confirmation
- stream-mode `External:` and `Internal:` lines only print after ota confirms the projected endpoint itself
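The distinction above can be sketched in a few lines: projecting is pure string resolution from declared values, while confirming requires actually reaching the endpoint. This is an illustration only. `projected_url` and `confirmed` are hypothetical helpers, the dictionary shape mirrors the `project.host` example in this document, and a plain TCP connect stands in for whatever confirmation ota really performs.

```python
import socket
from typing import Optional


def projected_url(host: dict, host_port: Optional[int] = None, scheme: str = "http") -> str:
    """Resolve the URL from declared values only; nothing is contacted."""
    # host_port plays the role of a per-run override of the declared port.
    port = host_port if host_port is not None else host["port"]["value"]
    return f"{scheme}://{host['address']}:{port}{host.get('path', '/')}"


def confirmed(address: str, port: int, timeout: float = 1.0) -> bool:
    """Confirmation needs a real connection from where the caller runs."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


declared_host = {
    "address": "127.0.0.1",
    "port": {"mode": "fixed", "value": 3000},
    "path": "/",
}
default_projection = projected_url(declared_host)
overridden_projection = projected_url(declared_host, host_port=3001)
```

`default_projection` is `http://127.0.0.1:3000/` and `overridden_projection` is `http://127.0.0.1:3001/` whether or not anything is listening; only `confirmed(...)` says something about reachability.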
Why internal readiness can still differ from host confirmation
- a container can answer on its internal listener while host publication is still broken or unreachable
- a remote or VM-backed Docker context can publish ports inside the VM while macOS localhost still cannot reach them
- that is why ota separates `projected endpoint` truth from `confirmed endpoint` truth
VM-backed container boundaries
One real-world example is Colima on macOS: Docker can report a published host port while the port is only reachable inside the container VM.
- in that case the app may be ready internally and even reachable inside the VM
- macOS `127.0.0.1:<port>` can still fail
- ota keeps the confirmation banner withheld because host reachability was not actually proven
- the interrupted pre-confirmation path now calls this out explicitly when Docker is running through the `colima` context

    colima ssh -- curl http://127.0.0.1:3001/

Reusable probes
Reusable readiness probes are the structured way to declare a transport-level readiness target once.
Choose reusable probes when one transport-level readiness target is needed in checks, workflow readiness, and/or task readiness.
Decision rule:
Use literal url probes when the endpoint is external, third-party, or cannot be modeled by Ota topology.
Use topology-derived probes when the endpoint is owned by a declared Ota task listener or service endpoint.
- declare reusable probes under `readiness.probes`
- use `url` when the endpoint is external, third-party, or intentionally outside the repo contract
- use topology-derived targets once the endpoint is declared by an Ota task listener or service endpoint
- task-target probes default to the invoking ota command plane, where `target.address_view: host` is the right published-endpoint shape
- use `target.observer.kind: task` plus `target.observer.task` when the probe should resolve `host`, `topology`, or `internal` exactly as one named task sees it from its own backend plane
- service-target probes can resolve `http` or `tcp` readiness directly from one declared service endpoint
- use `target.observer` only on task-target probes; service-target probes reject it
- HTTP probes now use the same request-shaping surface as runtime/service readiness: `method`, `headers`, `success.status`, and `body.contains`
- omit both `expect_status` and `success.status` for the normal default `200`, use `expect_status` when one shorthand status is clearer, and use `success.status` when you want multiple accepted statuses
- use `checks[].probe` when the probe should appear as a named check gate
- use `workflows.<name>.readiness.probes` when the selected workflow should prove that probe directly
- use `tasks.<name>.runtime.readiness.probe` or `services.<name>.readiness.probe` when runtime or service-manager readiness should reuse the same transport and timeout contract without duplicating it inline
- keep `checks[].run` as the portable escape hatch for checks that cannot be expressed as a built-in probe
- avoid duplicating the same HTTP readiness URL across inline shell checks, workflow readiness, task runtime readiness, or service readiness when one reusable probe can own it once
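The status rule in the list above (default `200`, `expect_status` as shorthand, `success.status` for several accepted codes) can be written down as a small resolver. This is a sketch, not ota's parser: `accepted_statuses` is a hypothetical helper, and rejecting a probe that sets both fields is an assumption made here because the combined case is not described.

```python
def accepted_statuses(probe: dict) -> set[int]:
    """Resolve which HTTP status codes count as success for one probe."""
    expect = probe.get("expect_status")
    statuses = probe.get("success", {}).get("status")
    if expect is not None and statuses is not None:
        # Assumption for this sketch: setting both is treated as ambiguous.
        raise ValueError("set either expect_status or success.status, not both")
    if expect is not None:
        return {int(expect)}  # shorthand for a single accepted status
    if statuses is not None:
        return {int(code) for code in statuses}  # several accepted statuses
    return {200}  # neither field set: the normal default


default_case = accepted_statuses({"kind": "http"})
shorthand_case = accepted_statuses({"kind": "http", "expect_status": 204})
multi_case = accepted_statuses({"kind": "http", "success": {"status": [200, 204]}})
```

Resolving to a set keeps the later comparison uniform: a response is accepted when its status code is a member, regardless of which field declared it.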
Literal URL probe for external or third-party endpoints
Use a literal url when the endpoint is not owned by an Ota task listener or service endpoint.
This keeps external services, third-party microservices, gateways, and quick-start probes easy to model without forcing fake topology into the contract.

    readiness:
      probes:
        backend-ready:
          kind: http
          url: http://127.0.0.1:5678/healthz/readiness
          method: GET
          headers:
            x-ota-probe: external
          success:
            status: [200]
          timeout: 10000
    checks:
      - name: backend-ready
        kind: health
        severity: error
        probe: backend-ready
    workflows:
      backend:
        readiness:
          probes:
            - backend-ready

Topology-derived probe for Ota-owned endpoints
Use a topology-derived target when the endpoint already belongs to a declared task listener or service endpoint.
That lets readiness follow the canonical host, port, listener, and endpoint truth instead of duplicating a URL that can drift.
Without an observer, task-target probes resolve from the invoking command plane and execution-topology JSON reports that explicitly as target.resolution_plane: command_host.
When one caller-relative view matters, declare target.observer.kind: task plus target.observer.task so the probe resolves exactly as that task sees the producer from its effective backend plane.

    readiness:
      probes:
        backend-ready:
          kind: http
          target:
            kind: task
            name: backend
            listener: backend
            address_view: topology
            observer:
              kind: task
              task: sandbox
          method: GET
          path: /healthz/readiness
          success:
            status: [200]
          timeout: 10000
        postgres-ready:
          kind: tcp
          target:
            kind: service
            name: postgres
            endpoint: app
          timeout: 10000