Runtime Container Guide

Status: Design/docs only. This document does not change compiler or runtime semantics. It describes how to package and run AINL as a containerized runtime unit inside any orchestrator or sandbox controller.
1. Purpose
This guide explains how to deploy the AINL runtime as a generic, containerized execution unit that any orchestrator, sandbox controller, or managed platform can launch, configure, monitor, and stop.
It is framework-agnostic. The patterns described here work with any container orchestrator (Docker, Kubernetes, Podman, etc.) and any agent host (NemoClaw, OpenShell, custom orchestrators, CI/CD systems, etc.).
2. Two deployment modes
2.1 Runner service mode (HTTP API)
Run the AINL runtime as a persistent HTTP service that accepts workflow execution requests via REST.
- Entrypoint:
  `ainl-runner-service` (or `uvicorn scripts.runtime_runner_service:app`)
- Default port: 8770
- Endpoints:
- `POST /run` — compile and execute an AINL program (optional `policy` for pre-execution validation)
- `POST /enqueue` — queue a program for background execution
- `GET /result/{id}` — retrieve a queued job result
- `GET /capabilities` — discover adapters, verbs, tiers, and runtime version
- `GET /health` — liveness probe
- `GET /ready` — readiness probe
- `GET /metrics` — runtime metrics
This mode is best when:
- the orchestrator sends workflows via HTTP,
- multiple workflows may execute over the service's lifetime,
- health/readiness monitoring is needed.
2.2 CLI mode (one-shot execution)
Run a single AINL program via the `ainl` CLI and exit.
- Entrypoint:
  `ainl run program [options]`
- Exit code: 0 on success, non-zero on failure
This mode is best when:
- the orchestrator launches a container per workflow,
- execution is one-shot (run once, return result, exit),
- no persistent HTTP service is needed.
3. Minimal Dockerfile for the runner service
FROM python:3.11-slim
WORKDIR /app
COPY pyproject.toml requirements-dev.txt ./
COPY compiler_v2.py compiler_grammar.py grammar_priors.py grammar_constraint.py runtime.py ./
COPY runtime/ ./runtime/
COPY adapters/ ./adapters/
COPY tooling/ ./tooling/
COPY scripts/ ./scripts/
COPY cli/ ./cli/
RUN pip install --no-cache-dir -e ".[web]"
EXPOSE 8770
ENV PYTHONUNBUFFERED=1
ENV AINL_AGENT_ROOT=/data/agents
ENV AINL_MEMORY_DB=/data/memory.sqlite3
HEALTHCHECK --interval=30s --timeout=5s \
CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8770/health')" || exit 1
CMD ["uvicorn", "scripts.runtime_runner_service:app", "--host", "0.0.0.0", "--port", "8770"]
3.1 Minimal Dockerfile for CLI mode
FROM python:3.11-slim
WORKDIR /app
COPY pyproject.toml requirements-dev.txt ./
COPY compiler_v2.py compiler_grammar.py grammar_priors.py grammar_constraint.py runtime.py ./
COPY runtime/ ./runtime/
COPY adapters/ ./adapters/
COPY tooling/ ./tooling/
COPY scripts/ ./scripts/
COPY cli/ ./cli/
RUN pip install --no-cache-dir -e .
ENTRYPOINT ["ainl", "run"]
Usage: `docker run ainl-cli program.ainl --max-steps 5000 --strict`
4. Configuration via environment and request
4.1 Environment variables
| Variable | Purpose | Default |
|----------|---------|---------|
| AINL_AGENT_ROOT | Sandbox root for agent coordination | /tmp/ainl_agents |
| AINL_MEMORY_DB | Path to SQLite memory store | /tmp/ainl_memory.sqlite3 |
| AINL_SUMMARY_ROOT | Root for metrics output | (adapter-specific) |
In a containerized deployment, point these at container-local paths (e.g.
/data/agents, /data/memory.sqlite3). If using ephemeral containers, these
can be temporary.
4.2 Runtime configuration via /run request
The runner service accepts configuration in the POST body:
{
"code": "S app api /api\nL1:\nR core.ADD 2 3 ->x\nJ x",
"strict": true,
"label": "L1",
"limits": {
"max_steps": 5000,
"max_depth": 50,
"max_adapter_calls": 500,
"max_time_ms": 30000
},
"allowed_adapters": ["core"],
"adapters": {
"enable": ["http"],
"http": {
"allow_hosts": ["api.internal.example.com"],
"timeout_s": 5.0
}
}
}
Key configuration fields:
- `allowed_adapters` — adapter allowlist (capability gating)
- `limits` — runtime resource limits
- `policy` — optional policy object for pre-execution IR validation (see below)
- `adapters.enable` / `adapters.<name>` — adapter-specific configuration
- `strict` — strict-mode compilation
- `record_calls` / `replay_log` — deterministic recording/replay
5. Health and readiness probes
The runner service exposes standard probe endpoints:
| Endpoint | Purpose | Healthy response |
|----------|---------|-----------------|
| GET /health | Liveness probe | {"status": "ok"} (200) |
| GET /ready | Readiness probe | {"status": "ready"} (200) |
| GET /metrics | Runtime metrics | JSON with run counts, durations, adapter stats |
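Orchestrators without built-in probe support can poll `/ready` themselves before routing work to the service. A minimal polling sketch, assuming only the probe contract in the table above:

```python
import time
import urllib.error
import urllib.request

def wait_until_ready(base_url: str, *, timeout_s: float = 30.0,
                     interval_s: float = 1.0) -> bool:
    """Poll GET /ready until it returns 200, or give up after timeout_s."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/ready", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # service not up yet; retry after the interval
        time.sleep(interval_s)
    return False
```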
5.1 Kubernetes probe configuration
livenessProbe:
httpGet:
path: /health
port: 8770
initialDelaySeconds: 5
periodSeconds: 30
readinessProbe:
httpGet:
path: /ready
port: 8770
initialDelaySeconds: 3
periodSeconds: 10
5.2 Docker Compose health check
services:
ainl-runner:
build: .
ports:
- "8770:8770"
healthcheck:
test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8770/health')"]
interval: 30s
timeout: 5s
retries: 3
6. Graceful shutdown
The runner service uses uvicorn, which handles SIGTERM for graceful shutdown.
Container orchestrators that send SIGTERM before SIGKILL (standard
Kubernetes behavior with a default 30-second grace period) will get clean
shutdown.
For long-running enqueued jobs, the orchestrator should:
- check `GET /result/{id}` before stopping the container, or
- accept that in-flight jobs may be lost on container stop.
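The first option amounts to a drain loop run before stopping the container (for example, from a Kubernetes preStop hook). The sketch below keeps the HTTP fetch injectable so it can be tested; the `"status": "pending"` payload convention is an assumption for illustration, not a documented result schema.

```python
import time
from typing import Callable, Optional

def drain_job(fetch_result: Callable[[], dict], *,
              grace_s: float = 25.0, interval_s: float = 1.0) -> Optional[dict]:
    """Poll a job's result until it finishes or the SIGTERM grace period
    runs out. fetch_result would typically call GET /result/{id} on the
    runner service and return the parsed JSON payload."""
    deadline = time.monotonic() + grace_s
    while time.monotonic() < deadline:
        payload = fetch_result()
        if payload.get("status") != "pending":
            return payload          # job finished (or failed): safe to stop
        time.sleep(interval_s)
    return None                     # grace period elapsed; job may be lost
```

Setting `grace_s` a few seconds below the orchestrator's termination grace period leaves room for the service's own shutdown.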
7. Resource constraints
7.1 Container-level constraints (orchestrator-enforced)
Set these in your container orchestrator:
- CPU limit: 0.5–2 cores (depends on workflow complexity)
- Memory limit: 256MB–1GB (depends on workflow data size)
- Wall-clock timeout: match your SLA (the runner service does not enforce a global timeout; use container-level timeouts as the outer boundary)
7.2 AINL-level constraints (runtime-enforced)
Set these in the /run request or CLI flags:
`max_steps`, `max_depth`, `max_adapter_calls`, `max_time_ms`, `max_frame_bytes`, `max_loop_iters`
See docs/operations/SANDBOX_EXECUTION_PROFILE.md for recommended limit
profiles.
These two layers provide defense-in-depth: AINL limits catch runaway workflows within the runtime, and container limits catch anything that escapes the runtime.
8. Sandboxed execution checklist
Before deploying AINL in a sandboxed or restricted environment:
- [ ] Choose an adapter allowlist profile from docs/operations/SANDBOX_EXECUTION_PROFILE.md
- [ ] Set runtime limits appropriate for your workload
- [ ] Configure environment variables to use container-local paths
- [ ] Set container-level resource constraints (CPU, memory, timeout)
- [ ] If using the `http` adapter, configure `allow_hosts` to restrict egress
- [ ] If using the `fs` adapter, configure `sandbox_root` to restrict filesystem access
- [ ] If the orchestrator needs pre-execution policy checks, include a `policy` object in the `/run` request (the runner validates IR against the policy before execution and returns HTTP 403 on violations)
- [ ] Verify health/readiness probes work in your orchestrator
- [ ] Review docs/advanced/SAFE_USE_AND_THREAT_MODEL.md for the trust model
9. Integration patterns
9.1 Orchestrator submits pre-compiled IR
If the orchestrator pre-compiles AINL to IR (using the compiler separately), it can submit IR directly to the runner service. This separates the compile step (which can be done in a trusted environment) from the execution step (which happens in the sandbox).
9.2 Orchestrator submits source code
The runner service can compile and execute in one step via POST /run with
code in the body. The runner compiles to IR, caches the result, and
executes.
9.3 One-shot container execution
For orchestrators that launch a container per workflow:
docker run --rm \
-e AINL_AGENT_ROOT=/data/agents \
ainl-cli program.ainl \
--strict \
--max-steps 5000 \
--enable-adapter core \
--json
The container runs the workflow, prints JSON output, and exits.
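An orchestrator written in Python can wrap this one-shot invocation as follows. This is a sketch around the command shown above: the `ainl-cli` image name matches the Dockerfile in section 3.1, and the helper names are hypothetical.

```python
import json
import subprocess

def one_shot_args(program: str, *, max_steps: int = 5000, strict: bool = True,
                  adapters: tuple = ("core",)) -> list:
    """Build the `docker run` argv for the one-shot CLI image (section 9.3)."""
    args = ["docker", "run", "--rm",
            "-e", "AINL_AGENT_ROOT=/data/agents",
            "ainl-cli", program,
            "--max-steps", str(max_steps), "--json"]
    if strict:
        args.append("--strict")
    for name in adapters:
        args += ["--enable-adapter", name]
    return args

def run_one_shot(program: str) -> dict:
    """Launch the container, check the exit code, and parse the JSON output."""
    proc = subprocess.run(one_shot_args(program),
                          capture_output=True, text=True)
    if proc.returncode != 0:  # CLI contract: non-zero means failure
        raise RuntimeError(proc.stderr)
    return json.loads(proc.stdout)
```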
9.4 Sidecar pattern
The AINL runner can run as a sidecar container alongside an agent process.
The agent sends workflows to localhost:8770/run and receives results.
The orchestrator manages both containers as a pod.
10. What AINL does not provide in containerized deployments
The AINL runtime is the workflow execution layer. It does not provide:
- container orchestration or scheduling,
- secret management (do not pass secrets in AINL source or adapter args),
- log aggregation (the runner emits structured JSON logs to stdout; pipe to your logging system),
- TLS termination (place behind a reverse proxy or ingress controller),
- multi-tenant isolation (one runtime instance per tenant, or enforce at the orchestrator level).
11. Relationship to other docs
- External orchestration guide: docs/operations/EXTERNAL_ORCHESTRATION_GUIDE.md
- Sandbox profiles and adapter allowlists: docs/operations/SANDBOX_EXECUTION_PROFILE.md
- Trust model and safe use: docs/advanced/SAFE_USE_AND_THREAT_MODEL.md
- Agent coordination: docs/advanced/AGENT_COORDINATION_CONTRACT.md
- Adapter registry: docs/reference/ADAPTER_REGISTRY.md
- Runner service source: scripts/runtime_runner_service.py
- MCP server (for MCP-compatible agent hosts): scripts/ainl_mcp_server.py (see docs/operations/EXTERNAL_ORCHESTRATION_GUIDE.md, section 9). The MCP server is a thin, stdio-only integration surface over the same compiler/runtime described here, not a replacement for the runner service or a standalone agent platform.
- Emitted server Dockerfile (application mode): tests/emits/server/Dockerfile
