Agent Manifest Reference

Complete specification for the AgentManifest YAML format (v1.0) — schema, field definitions, examples, and validation configuration.

API Version: 100monkeys.ai/v1 | Kind: Agent | Status: Canonical

The Agent Manifest is the source of truth for an Agent's identity, capabilities, security constraints, and execution requirements. It allows the Aegis Host to run agents safely and deterministically within the Membrane.

The manifest uses a Kubernetes-style declarative format (apiVersion/kind/metadata/spec) for consistency across all AEGIS resources.


Annotated Full Example

apiVersion: 100monkeys.ai/v1      # required; must be exactly this value
kind: Agent               # required; must be exactly "Agent"

metadata:
  name: python-coder              # required; unique DNS-label name (lowercase, alphanumeric, hyphens)
  version: "1.0.0"               # required; semantic version. (name + version) must be unique.
                                 # Overwriting an existing version requires the `force` parameter in management tools.
  description: "Writes Python solutions to programming tasks."  # optional
  labels:                         # optional; key-value pairs for filtering and discovery
    role: worker
    team: platform
  annotations:                    # optional; non-identifying metadata
    maintainer: "[email protected]"

spec:
  # security_context: aegis-system-agent-runtime  # operators only

  # Runtime configuration (StandardRuntime path: language + version)
  runtime:
    language: python              # StandardRuntime: required if image not specified
    version: "3.11"              # StandardRuntime: required if image not specified
    isolation: docker             # optional; inherit | firecracker | docker | process
    # image_pull_policy: IfNotPresent  # optional for StandardRuntime; default: IfNotPresent

  # OR, for CustomRuntime (uncomment to use private/custom image):
  #
  # runtime:
  #   image: "ghcr.io/my-org/custom-agent:v1.0"  # CustomRuntime: fully-qualified image
  #   image_pull_policy: IfNotPresent             # optional; Always | IfNotPresent | Never
  #   isolation: docker                           # optional; inherit | firecracker | docker | process
  # 
  # NOTE: Mutual Exclusion — specify EITHER (language + version) OR image, NOT both.

  # Task definition (instructions for the agent)
  task:
    instruction: |               # optional; high-level task guidance
      Write a Python solution to the given problem.
      Save the solution to /workspace/solution.py.
      Run it to verify correctness and write a JSON summary to /workspace/result.json.
    prompt_template: |           # optional; Handlebars template for LLM prompts
      {{instruction}}

      User: {{input}}
      {{#if previous_error}}
      Previous attempt failed: {{previous_error}}
      {{/if}}
      Assistant:

  # Security policy (deny-by-default; declare only what is needed)
  security:
    network:
      mode: allow                 # allow | deny | none
      allowlist:
        - pypi.org
        - api.github.com
    filesystem:
      read:
        - /workspace
        - /agent
      write:
        - /workspace
    resources:
      cpu: 1000                  # millicores (1000 = 1 CPU core)
      memory: "1Gi"             # human-readable: "512Mi", "1Gi", "2Gi"
      disk: "5Gi"               # human-readable disk quota
      timeout: "300s"           # human-readable: "300s", "5m", "1h"

  # Volume mounts (ephemeral and persistent storage)
  volumes:
    - name: workspace
      type: seaweedfs          # Default backend; seaweedfs | opendal | hostPath | seal
      storage_class: ephemeral   # ephemeral | persistent
      mount_path: /workspace     # absolute path inside container
      access_mode: read-write    # read-write | read-only
      ttl_hours: 1              # required for ephemeral; hours until auto-deletion
      size_limit: "5Gi"         # maximum volume size (Kubernetes-style: "500Mi", "5Gi")

    - name: shared-data
      type: seaweedfs
      storage_class: persistent
      mount_path: /workspace/shared-data
      access_mode: read-only
      source:
        volume_id: "vol-a1b2c3d4-..."  # pin to existing volume

    - name: cloud-datasets
      type: opendal            # External Volume
      provider: s3
      config:
        bucket: my-ml-datasets
        endpoint: "https://s3.us-east-1.amazonaws.com"
      storage_class: persistent
      mount_path: /workspace/datasets
      access_mode: read-only

  # Execution strategy and validation
  execution:
    mode: iterative              # one-shot | iterative
    max_iterations: 10           # range: 1–20; default: 10
    iteration_timeout: "60s"     # per-iteration timeout (default: 300s); "30s", "60s", "5m"
    memory: false                # enable Cortex memory system
    validation:
      - type: exit_code
        expected: 0
      - type: json_schema
        schema:
          type: object
          required: ["solution_path", "output"]
          properties:
            solution_path:
              type: string
            output:
              type: string
      - type: semantic
        judge_agent: "output-judge"
        criteria: "Ensure the final output is correct and complete."
        min_score: 0.80
        min_confidence: 0.70
    tool_validation:
      - type: semantic           # Inner-loop pre-execution semantic judge for tool calls
        judge_agent: "security-judge"
        criteria: "Ensure the tool call is safe and aligns with the agent's goal."
        min_score: 0.85
        min_confidence: 0.0
        timeout_seconds: 300

  # MCP tools the agent may invoke
  tools:
    - name: filesystem
      server: "mcp:filesystem"
      config:
        allowed_paths:
          - /workspace
        access_mode: read-write
    - name: web-search
      server: "mcp:web-search"
      config:
        allowed_domains:
          - pypi.org
          - docs.python.org
        max_results_per_query: 10

  # Environment variables (static, secret references, config references)
  env:
    PYTHONUNBUFFERED: "1"
    LOG_LEVEL: "debug"
    OPENAI_API_KEY: "secret:openai-key"   # injected from secure vault

Field Reference

Top-Level Fields

Field | Type | Required | Description
apiVersion | string | | Must be 100monkeys.ai/v1.
kind | string | | Must be Agent.
metadata | object | | Manifest metadata.
spec | object | | Agent specification.

metadata

Field | Type | Required | Description
name | string | | Unique agent name. Pattern: ^[a-z0-9][a-z0-9-]{0,62}$.
version | string | | Manifest schema version. Semantic versioning (e.g., "1.0.0").
description | string | | Human-readable description of the agent's purpose.
labels | map[string]string | | Key-value labels for categorization and discovery. Common keys: role, category, team, environment.
annotations | map[string]string | | Arbitrary non-identifying metadata (e.g., maintainer, docs).
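
The pattern given for name can be checked locally before a manifest is submitted. The following is a minimal sketch in Python; the helper name is_valid_agent_name is hypothetical and not part of any Aegis SDK, and only the regular expression itself comes from this reference.

```python
import re

# Pattern copied verbatim from the metadata.name row of this reference.
NAME_PATTERN = re.compile(r"^[a-z0-9][a-z0-9-]{0,62}$")


def is_valid_agent_name(name: str) -> bool:
    """Return True when name matches the documented metadata.name pattern.

    Hypothetical helper for illustration; the orchestrator's own check
    may apply additional rules that are not described here.
    """
    # fullmatch rejects a trailing newline, which a bare "$" would tolerate.
    return NAME_PATTERN.fullmatch(name) is not None
```

Under this pattern a name is 1 to 63 characters long, starts with a lowercase letter or digit, and may contain hyphens after the first character.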

spec

Field | Type | Required | Description
security_context | string | | Named security context for executions. Operators only.
output_handler | object | | Egress handler that fires after execution completes. See spec.output_handler.
runtime | object | | Runtime configuration.
task | object | | Task definition (instructions for the agent).
input_schema | JSON Schema object | | Declares named inputs accepted at execution time. See spec.input_schema.
security | object | | Security policy (deny-by-default).
volumes | object[] | | Volume mount declarations.
execution | object | | Execution strategy and acceptance criteria.
context | object[] | | Additional resources attached to the execution.
schedule | object | | Automatic execution scheduling.
delivery | object | | Output delivery destinations.
tools | object[] or string[] | | MCP tools the agent may invoke.
env | map[string]string | | Environment variables injected into the container.

spec.security_context

Field | Type | Required | Default
security_context | string | No | (inherits caller context)

Declares the named security context this agent's executions run under. When set, overrides the caller's execution context at runtime.

Permitted values are platform-configured security context names. On the standard platform, the only valid value is aegis-system-agent-runtime.

Only platform operators may set this field. Consumer accounts receive a 403 Forbidden error if they attempt to register an agent with security_context set.

Omitting this field (the default) means the agent runs under the calling user's or workflow's security context.

spec.output_handler

Field | Type | Required | Default
output_handler | object | No | (omitted)

Declares an egress handler that fires after agent execution completes. The handler receives the agent's final output and routes or transforms it through a delivery channel. When omitted, no post-execution delivery occurs.

Failure behavior: Controlled by the required field on the handler object.

required value | Handler failure effect
true | The execution is marked failed.
false (default) | Failure is logged; execution status is unaffected.

Workflow override: When an agent is invoked from a workflow, the workflow state's output_handler takes precedence over the agent manifest's declaration.

Four handler types are supported:

type: agent

Spawns a named agent and passes the output to it as input. Use this to chain a formatter, validator, or delivery agent after execution.

spec:
  output_handler:
    type: agent
    agent_id: my-formatter-agent
    required: false

Field | Type | Required | Description
type | string | | Must be agent.
agent_id | string | | Name of the agent to invoke.
required | boolean | | If true, handler failure marks execution failed. Default: false.

type: webhook

POSTs the result to an external HTTP endpoint. Supports secret interpolation in headers and a Handlebars body template.

spec:
  output_handler:
    type: webhook
    url: https://hooks.example.com/results
    method: POST
    headers:
      Authorization: Bearer {{secret:my-token}}
    body_template: '{"result": "{{output}}"}'
    timeout_seconds: 30
    required: true

Field | Type | Required | Description
type | string | | Must be webhook.
url | string | | Target URL.
method | string | | HTTP method. Default: POST.
headers | map[string]string | | Request headers. Supports {{secret:<name>}} interpolation.
body_template | string | | Handlebars template for the request body. Variable: {{output}}.
timeout_seconds | integer | | Request timeout in seconds. Default: 30.
required | boolean | | If true, handler failure marks execution failed. Default: false.

type: mcp_tool

Invokes an MCP tool after execution — for example, sending a notification or writing to an external system.

spec:
  output_handler:
    type: mcp_tool
    tool_name: slack.send_message
    arguments:
      channel: "#results"
      text: "{{output}}"
    required: false

Field | Type | Required | Description
type | string | | Must be mcp_tool.
tool_name | string | | Fully-qualified MCP tool name (e.g., slack.send_message).
arguments | object | | Tool arguments. Supports {{output}} interpolation.
required | boolean | | If true, handler failure marks execution failed. Default: false.

type: container

Runs a container to transform or deliver the output. The output is injected into the command via {{output}} interpolation.

spec:
  output_handler:
    type: container
    image: my-registry/formatter:latest
    command: ["/bin/format", "--input", "{{output}}"]
    required: false

Field | Type | Required | Description
type | string | | Must be container.
image | string | | Fully-qualified container image reference.
command | string[] | | Command and arguments. Supports {{output}} interpolation.
required | boolean | | If true, handler failure marks execution failed. Default: false.

spec.runtime

spec.runtime supports two mutually exclusive runtime modes:

  • StandardRuntime: Specify language + version; the orchestrator resolves them to an official Docker image
  • CustomRuntime: Specify image (fully-qualified reference); the orchestrator pulls it from the registry

Mutual Exclusion: A single manifest cannot specify both modes.

Field | Type | Required | Default | Description
language | string | Conditional | | StandardRuntime: Programming language (python, javascript, typescript, rust, go). Required unless image is specified.
version | string | Conditional | | StandardRuntime: Language version (e.g., "3.11", "20"). Required if language is specified.
image | string | Conditional | | CustomRuntime: Fully-qualified Docker image (e.g., ghcr.io/org/agent:v1.0). Required unless both language and version are specified.
image_pull_policy | string | | IfNotPresent | Image caching strategy: Always (pull from registry every time), IfNotPresent (use cache if available, pull if missing), Never (cache only; fail if missing, no network attempt). See Container Registry & Image Management.
isolation | string | | inherit | One of inherit, firecracker, docker, process.
model | string | | default | LLM model alias from aegis-config.yaml.

Mutual Exclusion Validation Rules:

  • Valid (StandardRuntime): language: "python" + version: "3.11"
  • Valid (CustomRuntime): image: "ghcr.io/org/custom:latest"
  • Invalid: Both language+version AND image specified
  • Invalid: language without version
  • Invalid: version without language
  • Invalid: image without fully-qualified format (missing /)
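
The rules above can be expressed as a small pre-flight check. This is a sketch under stated assumptions rather than the orchestrator's validator: classify_runtime is a hypothetical helper, the error messages are invented for illustration, and combining image with only one of language or version is treated as invalid, which the list above does not spell out.

```python
def classify_runtime(runtime: dict) -> str:
    """Return "standard" or "custom" for a spec.runtime mapping.

    Raises ValueError when the mapping violates the mutual exclusion
    rules listed in this reference. Hypothetical helper, not an SDK call.
    """
    language = runtime.get("language")
    version = runtime.get("version")
    image = runtime.get("image")

    # Assumption: image combined with either StandardRuntime field is invalid.
    if image is not None and (language is not None or version is not None):
        raise ValueError("specify either language + version or image, not both")

    if image is not None:
        # Documented rule: an image without "/" is not fully qualified.
        if "/" not in image:
            raise ValueError("image must be a fully-qualified reference")
        return "custom"

    if language is not None and version is None:
        raise ValueError("language requires version")
    if version is not None and language is None:
        raise ValueError("version requires language")
    if language is None:
        raise ValueError("runtime needs language + version or image")
    return "standard"
```

For instance, classify_runtime({"language": "python", "version": "3.11"}) returns "standard", while a mapping that sets both language and image raises ValueError.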

spec.task

Field | Type | Required | Description
instruction | string | | High-level guidance for the agent (multi-line YAML |). Auto-enables semantic validation.
prompt_template | string | | Handlebars template for constructing LLM prompts. Default: "{{instruction}}\n\nUser: {{input}}\nAssistant:".
input_data | object | | Structured input parameters available at execution time.

Prompt template variables: {{intent}}, {{instruction}}, {{input}}, {{iteration_number}}, {{previous_error}}, {{context}}.

To import community skill definitions as agent baselines, use the built-in skill-import workflow.

spec.input_schema

An optional JSON Schema object describing the named inputs this agent accepts at execution time. Declaring input_schema lets callers know exactly what to pass and enables automatic validation of execution inputs before the agent runs.

Field | Type | Required | Description
input_schema | JSON Schema object | | Describes the named inputs accepted at execution time. Must be a JSON Schema object with type: object.

The following JSON Schema property types are supported. The Zaru client context panel renders each type as a specific UI element:

JSON Schema type | Zaru UI control
string | Text input
number / integer | Number spinner
boolean | Checkbox
enum | Dropdown select
object | Nested form group
array | Repeatable row builder

intent and input

Every agent invocation accepts two optional fields on the execution request:

  • intent — natural-language steering. Describes what the caller wants the agent to achieve. Always available as {{intent}} in prompt templates (empty string when absent).
  • input — structured data for the execution. When input_schema is declared, input must conform to it (validated before dispatch; rejection = HTTP 422). Available as {{input}} (full JSON blob) and {{input.KEY}} (dot-notation when the value is an object).

These are complementary, not alternatives — intent steers the LLM with natural language; input supplies typed data.

intent | input | Rendered prompt
Present | Present (object) | {{intent}} available; {{input.KEY}} dot-notation available.
Present | Absent / empty | {{intent}} available; {{input}} is empty.
Absent | Present (object) | {{intent}} is empty; {{input.KEY}} dot-notation available.
Absent | Absent | Both empty; prompt rendered from spec.task.instruction only.
Any | Present (string) | {{input}} is the raw string value; no dot-notation.

Example:

spec:
  input_schema:
    type: object
    required:
      - document_url
      - output_format
    properties:
      document_url:
        type: string
        description: "URL of the document to process."
      output_format:
        type: string
        enum: ["json", "markdown", "html"]
        description: "Desired output format."
      max_pages:
        type: integer
        description: "Maximum number of pages to process. Default: all pages."
      include_metadata:
        type: boolean
        description: "Whether to include document metadata in the output. Default: false."

aegis agent run document-processor \
  --input '{"document_url": "https://example.com/report.pdf", "output_format": "json", "max_pages": 10}'

Note: spec.input_schema is a schema descriptor — it declares what inputs the agent accepts and validates them at invocation time. It is distinct from spec.task.input_data, which is a map of default runtime values injected into the container environment at startup and is not validated.

Judge agents: When an agent is used as a judge in a semantic or multi_judge validator, its input_schema property names determine how the orchestrator maps evaluation data (output content, objective, criteria, etc.) to the judge's prompt. Declaring input_schema in a judge manifest gives the judge structured access to all evaluation fields rather than receiving raw string content. See Judge Input Schema for the canonical property names.

spec.security

Field | Type | Required | Description
network.mode | enum (allow, deny, none) | | Policy mode. allow = permit only the allowlist; deny = block the denylist and allow everything else; none = no network interface.
network.allowlist | string[] | | Domain names and CIDR blocks permitted for outbound connections.
network.denylist | string[] | | Domain names explicitly blocked (applied after allowlist).
filesystem.read | string[] | | Paths inside the container where reads are permitted. Glob patterns supported.
filesystem.write | string[] | | Paths inside the container where writes are permitted. Glob patterns supported.
filesystem.read_only | boolean | | Set true to make all mounts read-only.
resources.cpu | integer | | CPU limit in millicores (1000 = 1 core). Default: 1000.
resources.memory | string | | Memory limit. Human-readable: "512Mi", "1Gi". Default: "512Mi".
resources.disk | string | | Disk quota. Human-readable: "1Gi", "10Gi". Default: "1Gi".
resources.timeout | string | | Total execution timeout. Human-readable: "300s", "5m", "1h". Max "1h".

spec.volumes[]

Field | Type | Required | Default | Description
name | string | | | Local identifier. Must be unique within this manifest.
type | enum (seaweedfs, opendal, hostPath, seal) | | seaweedfs | Storage backend that provides this volume.
provider | string | Required for opendal | | The specific cloud API (e.g., s3, gcs) or scheme.
config | object | Required for non-seaweedfs types | | Backend-specific configuration fields.
storage_class | enum (ephemeral, persistent) | | | Lifetime of the volume.
mount_path | string | | | Absolute path inside the container, rooted at /workspace (e.g., /workspace, /workspace/datasets).
access_mode | enum (read-write, read-only) | | | Access mode enforced by AegisFSAL.
ttl_hours | integer | Required for ephemeral | | Hours until auto-deletion (e.g., 1, 24).
size_limit | string | | no limit | Maximum volume size as a Kubernetes resource string (e.g., "500Mi", "5Gi"). Writes beyond this return ENOSPC.
source.volume_id | string | Required for persistent | | UUID of an existing persistent volume. Supports Handlebars: {{input.dataset_volume_id}}.

spec.execution

Field | Type | Required | Default | Description
mode | enum (iterative, one-shot) | | one-shot | Execution strategy.
max_iterations | integer | | 10 | Maximum refinement loops. Range: 1–20.
iteration_timeout | string | | "300s" | Per-iteration timeout. Human-readable: "30s", "60s", "5m". Each iteration gets this much time for LLM calls + tool invocations.
memory | boolean | | false | Enable Cortex learning memory.
llm_timeout_seconds | integer | | 300 | HTTP socket timeout for bootstrap.py LLM proxy calls in seconds. Applies to network requests from agent to orchestrator.
validation | ValidatorSpec[] | | | Ordered outer-loop validators executed after each iteration.
tool_validation | ValidatorSpec[] | | | Ordered inner-loop tool-call validators. Current runtime consumes semantic entries for pre-execution judging; see Tool-Call Judging.

spec.execution.validation

validation is an ordered list of ValidatorSpec entries executed after each iteration. Each variant has its own fields:

exit_code

Field | Type | Default | Description
expected | integer | 0 | Required exit code.
min_score | float | 1.0 | Minimum score to accept this step.

json_schema

Field | Type | Default | Description
schema | JSON Schema object | | Schema applied to the output payload.
min_score | float | 1.0 | Minimum score to accept this step.

regex

Field | Type | Default | Description
pattern | string | | Regular expression to match against the selected output stream.
target | string | stdout | Output stream to inspect: stdout or stderr.
min_score | float | 1.0 | Minimum score to accept this step.

semantic

Field | Type | Default | Description
judge_agent | string | | Deployed judge agent to spawn as a child execution.
criteria | string | | Human-readable rubric passed to the judge.
min_score | float | 0.7 | Minimum score required to accept the output.
min_confidence | float | 0.0 | Minimum self-reported confidence required.
timeout_seconds | integer | 300 | Maximum wait time for the judge child execution in seconds.

multi_judge

Field | Type | Default | Description
judges | string[] | | Deployed judge agents to spawn in parallel.
consensus | enum | weighted_average | Aggregation strategy used to combine judge results.
min_judges_required | integer | 1 | Minimum number of judges that must complete.
criteria | string | | Human-readable rubric passed to each judge.
min_score | float | 0.7 | Minimum score required to accept the output.
min_confidence | float | 0.0 | Minimum self-reported confidence required.
timeout_seconds | integer | 300 | Maximum wait time for all judge child executions in seconds.

spec.execution.tool_validation

tool_validation is an ordered list of validator specifications. In the current runtime, only semantic entries are used for tool-call gating; they run before dispatch and can block the call synchronously. See Tool-Call Judging for the inner-loop contract.

Field | Type | Required | Default | Description
type | semantic | | | Validator type used by the inner-loop tool gate.
judge_agent | string | | | Deployed judge agent to spawn as a child execution.
criteria | string | | | Human-readable rubric passed to the judge.
min_score | float | | 0.7 | Minimum score required to allow the tool call.
min_confidence | float | | 0.0 | Minimum self-reported confidence required to allow the tool call.
timeout_seconds | integer | | 300 | Maximum wait time for the judge child execution in seconds.

Tool-call judging is conjunctive: the orchestrator allows the tool only when both the score and confidence thresholds are satisfied.
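
The conjunctive rule can be illustrated with a short Python sketch. The function name tool_call_allowed is hypothetical, the defaults mirror the table above, and treating both thresholds as inclusive (greater than or equal) is an assumption that this reference does not confirm.

```python
def tool_call_allowed(
    score: float,
    confidence: float,
    min_score: float = 0.7,
    min_confidence: float = 0.0,
) -> bool:
    """Allow the tool call only when both thresholds are satisfied.

    Hypothetical helper. Assumption: thresholds are inclusive minimums.
    """
    return score >= min_score and confidence >= min_confidence
```

With min_score: 0.85 and min_confidence: 0.0, as in the annotated example, a judge result of score 0.9 passes regardless of confidence, while a score of 0.8 blocks the call.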

If a non-semantic validator is present in tool_validation, the current inner-loop tool gate ignores it. Declare such validators under validation, where they run as outer-loop validators.

spec.tools[]

Tools can be declared in simple (string) or detailed (object) format.

Simple format:

tools:
  - "mcp:filesystem"
  - "mcp:web-search"
  - "mcp:gmail"

Detailed format:

Field | Type | Required | Description
name | string | | Local name for this tool binding.
server | string | | MCP server identifier (e.g., "mcp:filesystem").
config | object | | Tool-specific configuration.

Filesystem tool config (mcp:filesystem):

Field | Type | Default | Description
allowed_paths | string[] | ["/workspace"] | Permitted directory prefixes.
access_mode | enum (read-only, read-write) | read-only | Write access mode.
max_file_size_bytes | integer | 10485760 | Maximum file size per operation.

Web search tool config (mcp:web-search):

Field | Type | Default | Description
allowed_domains | string[] | [] (all) | Restrict search results to these domains.
max_results_per_query | integer | 10 | Maximum results returned per search.
max_calls_per_execution | integer | 50 | Rate limit for search invocations.

Gmail tool config (mcp:gmail):

Field | Type | Default | Description
allowed_operations | string[] | ["read","search"] | Permitted operations: read, search, send.
max_messages_per_query | integer | 50 | Maximum messages returned per query.
max_calls_per_execution | integer | 30 | Rate limit for Gmail API calls.
allowed_labels | string[] | [] (all) | Restrict access to these Gmail labels.

Builtin command executor (builtin:cmd):

cmd.run is not an MCP server. Use executor: "builtin:cmd" to enable in-container subprocess execution via the Dispatch Protocol. The server field is omitted; only name, executor, and config are used.

tools:
  - name: cmd
    executor: "builtin:cmd"
    config:
      subcommand_allowlist:
        git: [clone, add, commit, push, pull, status, diff]
        cargo: [build, test, fmt, clippy, check, run]
        npm: [install, run, test, build, ci]
        python: ["-m"]
      env_var_denylist:
        - OPENAI_API_KEY
      timeout_ceiling_secs: 120
      max_output_bytes: 524288

Field | Type | Default | Description
subcommand_allowlist | object | | Required. Map of command → [allowed_first_args]. Any cmd.run call with a command or first argument not in this map is rejected with a policy violation.
env_var_denylist | string[] | [] | Additional env vars stripped before forwarding to the subprocess. Appended to the node-level global_env_denylist.
timeout_ceiling_secs | integer | 300 | Maximum per-subprocess timeout this agent may request. Cannot exceed builtin_dispatchers.cmd.max_timeout_ceiling_secs in the node config.
max_output_bytes | integer | 524288 | Maximum captured stdout+stderr per subprocess in bytes. Truncated output is signaled to the LLM with a notice.

builtin:cmd is the only valid non-mcp: executor value. The subcommand_allowlist is the only required config field; all other fields inherit node-level defaults when omitted.
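
The subcommand_allowlist check described above can be sketched as follows. This is an illustration, not the Dispatch Protocol implementation: cmd_allowed is a hypothetical name, and rejecting a call that has no arguments at all is an assumption, since this reference only describes calls with a first argument.

```python
def cmd_allowed(command: str, args: list[str], allowlist: dict[str, list[str]]) -> bool:
    """Return True when command and its first argument appear in allowlist.

    allowlist maps a command to its permitted first arguments, as in the
    subcommand_allowlist example. Hypothetical helper for illustration.
    """
    allowed_first_args = allowlist.get(command)
    if allowed_first_args is None:
        return False  # command itself is not listed
    if not args:
        return False  # assumption: a bare command has no allowed first argument
    return args[0] in allowed_first_args


# Subset of the allowlist from the manifest example above.
EXAMPLE_ALLOWLIST = {
    "git": ["clone", "add", "commit", "push", "pull", "status", "diff"],
    "python": ["-m"],
}
```

Under this allowlist, git status and python -m pytest are permitted, while git rebase and any rm invocation are rejected.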

spec.env

Environment variable values support three formats:

Format | Example | Description
Static string | "production" | Literal value.
Secret reference | "secret:openai-key" | Injected from secure vault (OpenBao). Never logged.
Config reference | "config:log_level" | From configuration store.
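
The three value formats can be told apart by prefix. The sketch below is a hypothetical classifier for illustration only; it does not contact a vault or configuration store, and the name classify_env_value is not part of any Aegis SDK.

```python
def classify_env_value(value: str) -> tuple[str, str]:
    """Split a spec.env value into (kind, payload).

    kind is "secret", "config", or "static"; payload is the referenced
    key for references, or the literal value for static strings.
    """
    for prefix, kind in (("secret:", "secret"), ("config:", "config")):
        if value.startswith(prefix):
            return kind, value[len(prefix):]
    return "static", value
```

For example, "secret:openai-key" is classified as a secret reference with key openai-key, and "production" is treated as a literal.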

Field Definitions

spec.runtime

Defines the execution environment for the agent.

StandardRuntime (Official Images)

When you specify language and version, the orchestrator resolves them to a deterministic official Docker image by consulting the StandardRuntime Registry committed to the repository. The registry enforces:

  1. No "latest" tags: Every language+version maps to an exact, immutable image tag
  2. Production-grade variants: -slim for Debian-based, -alpine for Alpine (minimal image size)
  3. Validation at execution time: Unsupported language/version combinations fail fast with clear error messages

Language | Version | Docker Image | Status
python | 3.11 | python:3.11-slim | Supported
python | 3.10 | python:3.10-slim | Supported
python | 3.9 | python:3.9-slim | Deprecated; migrate to 3.10 or 3.11
javascript | 20 | node:20-alpine | Supported (Current LTS)
javascript | 18 | node:18-alpine | Supported (Legacy LTS)
typescript | 5.1 | node:20-alpine | Supported
rust | 1.75 | rust:1.75-alpine | Supported
rust | 1.74 | rust:1.74-alpine | Supported (Legacy)
go | 1.21 | golang:1.21-alpine | Supported
go | 1.20 | golang:1.20-alpine | Supported (Legacy)

For the full matrix including pre-installed toolchains, deprecation lifecycle, and pre-caching guidance, see Standard Runtime Registry.

Where the mapping lives: The canonical registry is defined in runtime-registry.yaml in the orchestrator repository. This file is committed and loaded at daemon startup.
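
Conceptually, resolution is a lookup from (language, version) to an exact image tag. The sketch below hand-copies the table above into a Python dictionary for illustration; the real mapping is loaded from runtime-registry.yaml, whose file layout is not shown in this reference, and resolve_image is a hypothetical name.

```python
# Hand-copied from the StandardRuntime table in this reference.
# Illustration only: the orchestrator reads runtime-registry.yaml instead.
STANDARD_RUNTIMES = {
    ("python", "3.11"): "python:3.11-slim",
    ("python", "3.10"): "python:3.10-slim",
    ("python", "3.9"): "python:3.9-slim",
    ("javascript", "20"): "node:20-alpine",
    ("javascript", "18"): "node:18-alpine",
    ("typescript", "5.1"): "node:20-alpine",
    ("rust", "1.75"): "rust:1.75-alpine",
    ("rust", "1.74"): "rust:1.74-alpine",
    ("go", "1.21"): "golang:1.21-alpine",
    ("go", "1.20"): "golang:1.20-alpine",
}


def resolve_image(language: str, version: str) -> str:
    """Return the image tag for a language + version pair, or fail fast."""
    try:
        return STANDARD_RUNTIMES[(language, version)]
    except KeyError:
        raise ValueError(
            f"unsupported runtime: language={language!r} version={version!r}"
        ) from None
```

Because every key maps to a fixed tag, the same manifest always resolves to the same image, and an unsupported combination fails before any pull is attempted.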

Benefits:

  • Deterministic: Same language+version always pulls the same vetted image
  • Managed by AEGIS: No arbitrary image choices; all runtimes are vetted for security and compatibility
  • Fast: Images are cached; only pulled once
  • Validated: Manifests with unsupported language/version combinations are rejected at execution validation time

CustomRuntime (User-Supplied Images)

When you specify image, the orchestrator pulls it from your container registry. The field accepts fully-qualified references such as ghcr.io/org/custom:v1.0.

Benefits: Flexibility for OS packages, custom dependencies, private registries. See Custom Runtime Agents for a complete walkthrough.

Bootstrap Script

Both paths use the orchestrator-provided bootstrap script for 100monkeys iteration control. To bundle your own bootstrap script in a CustomRuntime image, declare its path in spec.advanced.bootstrap_path. See Custom Runtime Agents — Bootstrap Handling.

Isolation Mode

isolation selects the isolation technology used to run the agent:

  • inherit (default): Use node configuration setting
  • docker: Run in a container managed by the configured container runtime (Phase 1)
  • firecracker: Firecracker microVM (Phase 2)
  • process: Host process (Phase 2)

The docker isolation type uses the container runtime configured via container_socket_path in the node configuration. This works with both Docker and Podman — the runtime is auto-detected.

spec.task

task.instruction

High-level steering instructions for the agent. When present, the instruction text can be reused as the criteria for a semantic validator in the ordered spec.execution.validation list.

task.prompt_template

Handlebars template that controls how the LLM prompt is assembled.

Available variables:

Variable | Description
{{intent}} | Natural-language steering from the caller's execution request. Empty string when absent.
{{instruction}} | The task.instruction text.
{{input}} | JSON input supplied at execution time.
{{iteration_number}} | Current iteration (1-based).
{{previous_error}} | Validator failure output from the previous iteration; empty on iteration 1.
{{context}} | Concatenated context attachments.

spec.security

All permissions are deny-by-default. When spec.security is omitted entirely, the agent runs with no network or filesystem access and default resource limits.

Network Policy

mode: none attaches no network interface to the container — the strongest isolation. mode: allow with an allowlist permits only the listed domains. mode: deny with a denylist blocks specific domains and allows all others.

Filesystem Policy

Paths support glob patterns (e.g., "/config/*.yaml"). filesystem.read_only: true overrides all other settings and makes every mount read-only regardless of access_mode.

Resource Limits — Timeout Hierarchy

The timeout field is the outer bound for the entire execution. All validation sub-timeouts must fit within this budget:

security.resources.timeout                                    (e.g., 600s)
  ├─ execution.validation[*].timeout_seconds         (e.g., 90s)  outer-loop validator budgets
  └─ execution.tool_validation[*].timeout_seconds    (e.g., 30s)  inner-loop tool-call judge
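
A manifest can be checked against this hierarchy before submission. The sketch below is a minimal illustration under two assumptions: durations use only the s, m, and h suffixes shown in this reference, and "fit within this budget" means each validator's timeout_seconds individually does not exceed the overall timeout. Both function names are hypothetical.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> int:
    """Convert a duration such as "300s", "5m", or "1h" to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def validators_exceeding_budget(spec: dict) -> list[str]:
    """List validator paths whose timeout_seconds exceeds the total timeout.

    Only validators that set timeout_seconds explicitly are checked
    (assumption: variants without that field have no sub-timeout).
    """
    budget = parse_duration(spec["security"]["resources"]["timeout"])
    execution = spec.get("execution", {})
    offenders = []
    for key in ("validation", "tool_validation"):
        for index, validator in enumerate(execution.get(key, [])):
            seconds = validator.get("timeout_seconds")
            if seconds is not None and seconds > budget:
                offenders.append(f"execution.{key}[{index}]")
    return offenders
```

An empty result means every declared sub-timeout fits inside security.resources.timeout; any returned path points at a validator whose budget should be lowered or whose outer timeout should be raised.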

spec.volumes[]

Storage volumes are mounted into the agent container via the orchestrator's AegisFSAL layer. Volume mount declarations are transport-agnostic: the runtime automatically selects the appropriate transport (NFS for rootful Docker, FUSE bind mounts for rootless Podman, virtio-fs for Firecracker) based on the node configuration. The agent manifest does not specify a transport. Whichever transport is selected, the orchestrator intercepts all file operations, enforcing policy and maintaining a full audit trail.

Class | TTL | Use Case
ephemeral | Required; auto-cleanup | Scratch space, build artifacts
persistent | None; explicit delete | Shared datasets, long-lived output

access_mode: read-write is exclusive — only one execution may hold write access to a persistent volume at a time. read-only volumes may be read by multiple agents simultaneously.

All spec.volumes[].mount_path values must be rooted at /workspace.
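
A few of the volume rules stated in this reference (unique names, mount_path rooted at /workspace, ttl_hours required for ephemeral volumes) can be checked with a short Python sketch. The function volume_errors and its messages are hypothetical, and it deliberately covers only these three rules.

```python
import posixpath


def volume_errors(volumes: list[dict]) -> list[str]:
    """Return human-readable problems found in a spec.volumes list."""
    errors = []
    seen_names = set()
    for volume in volumes:
        name = volume.get("name", "<unnamed>")
        if name in seen_names:
            errors.append(f"{name}: duplicate volume name")
        seen_names.add(name)

        # Normalize first so "/workspace/../etc" is not accepted.
        mount = posixpath.normpath(volume.get("mount_path", ""))
        if mount != "/workspace" and not mount.startswith("/workspace/"):
            errors.append(f"{name}: mount_path must be rooted at /workspace")

        if volume.get("storage_class") == "ephemeral" and "ttl_hours" not in volume:
            errors.append(f"{name}: ephemeral volumes require ttl_hours")
    return errors
```

Running it on the workspace and shared-data entries from the annotated example yields no errors, while an ephemeral volume mounted at /tmp without ttl_hours yields two.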

spec.execution

Controls the 100monkeys iterative execution strategy. When mode: iterative, if any validation check fails the orchestrator sends the failure output back to the agent as {{previous_error}} and starts a new iteration (up to max_iterations). When mode: one-shot, the first run is final.

Setting memory: true enables the Cortex learning system to index refinements from this execution, improving suggestions for future runs.
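
The iterative strategy described above can be sketched as a small loop. This is a simplified illustration, not the orchestrator's implementation: run_iterative, attempt, and validate are hypothetical names, and the ordered validator list is collapsed into a single callable that returns None on success or a failure message otherwise.

```python
from typing import Callable, Optional


def run_iterative(
    attempt: Callable[[int, str], str],
    validate: Callable[[str], Optional[str]],
    max_iterations: int = 10,
) -> tuple[bool, int]:
    """Run attempt until validate passes or max_iterations is reached.

    attempt receives the 1-based iteration number and the previous
    validator failure (empty on iteration 1). Returns (succeeded,
    iterations_used).
    """
    previous_error = ""
    for iteration in range(1, max_iterations + 1):
        output = attempt(iteration, previous_error)
        failure = validate(output)
        if failure is None:
            return True, iteration
        # Feed the failure back, mirroring {{previous_error}}.
        previous_error = failure
    return False, max_iterations
```

In this sketch, mode: one-shot corresponds to calling the loop with max_iterations=1, so the first run is final whether or not validation passes.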

validation and tool_validation

validation and tool_validation are both ordered lists of ValidatorSpec entries. Outer-loop validators run after each iteration; inner-loop tool-call validators run before dispatch.

Common validator variants:

  • exit_code - checks the process exit code.
  • json_schema - validates the agent output against a JSON Schema document.
  • regex - validates output text with a regular expression.
  • semantic - spawns a judge agent as a child execution and compares the returned score and confidence against thresholds.
  • multi_judge - runs multiple judge agents and aggregates their results.

For tool-call judging specifics, including the judge payload and bypass semantics, see Tool-Call Judging.

spec.tools[]

The orchestrator mediates all tool calls — agents never access MCP servers or credentials directly.

Security model:

| Layer | What is enforced |
| --- | --- |
| Authentication | Orchestrator validates execution_id before forwarding any call. |
| Authorization | Tool name must appear in spec.tools. Absent = rejected. |
| Policy validation | Arguments validated against tool-specific config (paths, domains, operations). |
| Rate limiting | Call count tracked per execution; calls beyond limit return 429. |
| Credential isolation | OAuth tokens and API keys held by orchestrator; never exposed to agent. |
| Audit trail | Every invocation published as MCPToolEvent domain event. |
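
The first four layers behave like a chain of gates that each call must pass in order. The sketch below models that chain for a filesystem-style tool. It is illustrative only: `ToolGate` is a hypothetical class, the 401 and 403 codes and the `max_calls` default are assumptions (the source only specifies 429 for rate limiting), and credential isolation and audit events are out of scope.

```python
import posixpath
from dataclasses import dataclass, field


@dataclass
class ToolGate:
    """Hypothetical stand-in for per-execution tool-call mediation."""
    active_executions: set
    allowed_tools: dict       # tool name -> config, mirroring spec.tools[]
    max_calls: int = 100      # assumed limit; the source states no default
    calls: dict = field(default_factory=dict)

    def check(self, execution_id: str, tool: str, path: str) -> int:
        """Return an HTTP-style status code for one tool call."""
        # Authentication: the execution must be known.
        if execution_id not in self.active_executions:
            return 401
        # Authorization: the tool must be declared in the manifest.
        if tool not in self.allowed_tools:
            return 403
        # Policy validation: the normalised path must sit under an allowed root.
        target = posixpath.normpath(path)
        roots = self.allowed_tools[tool].get("allowed_paths", [])
        if not any(target == r or target.startswith(r.rstrip("/") + "/") for r in roots):
            return 403
        # Rate limiting: count accepted calls per execution.
        count = self.calls.get(execution_id, 0) + 1
        self.calls[execution_id] = count
        return 429 if count > self.max_calls else 200
```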

spec.context[]

Additional resources attached to the agent at execution time.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | text \| file \| directory \| url | | Resource type. |
| content | string | Required for text | Inline text content. |
| path | string | Required for file, directory | File or directory path. |
| url | string | Required for url | URL to fetch. |
| description | string | | Human-readable description. |
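
The conditional requirements in this table reduce to a lookup from resource type to the field it needs. A small pre-flight check might look like the sketch below; `check_context_entry` is a hypothetical helper, not part of any AEGIS SDK.

```python
# Field that each resource type requires, taken from the table above.
REQUIRED_FIELD = {
    "text": "content",
    "file": "path",
    "directory": "path",
    "url": "url",
}


def check_context_entry(entry: dict) -> list[str]:
    """Return the problems found in one spec.context[] entry (hypothetical helper)."""
    kind = entry.get("type")
    if kind not in REQUIRED_FIELD:
        return [f"unknown context type: {kind!r}"]

    needed = REQUIRED_FIELD[kind]
    if not entry.get(needed):
        return [f"type {kind!r} requires {needed!r}"]
    return []
```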

spec.schedule

| Field | Type | Description |
| --- | --- | --- |
| type | cron \| interval \| manual | Schedule type. |
| cron | string | Standard cron expression (e.g., "0 * * * *" = hourly). |
| timezone | string | IANA timezone (e.g., "America/New_York"). |
| enabled | boolean | Whether the schedule is active. |

spec.advanced

| Field | Type | Description |
| --- | --- | --- |
| bootstrap_path | string | Path to custom bootstrap script inside the container. Only valid for CustomRuntime. The field is declared in the schema and exposed in the Python and TypeScript SDKs; runtime wiring is tracked in the orchestrator backlog and is not yet active in the current release. |
| warm_pool_size | integer | Number of pre-warmed container instances. |
| swarm_enabled | boolean | Enable multi-agent coordination. |
| startup_script | string | Custom startup script. |

spec.delivery

Output delivery destinations are evaluated after execution completes. Each destination has the following fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | | Unique identifier for this destination. |
| condition | on_success \| on_failure \| always | | When to deliver. |
| transform | object | | ETL script applied to output before delivery. |
| type | email \| webhook \| rest \| sms | | Delivery mechanism. |

Transform fields:

| Field | Type | Description |
| --- | --- | --- |
| script | string | Path to transformation script. Receives agent output on stdin. |
| args | string[] | Additional CLI arguments for the script. |
| timeout_seconds | integer | Script timeout. Default: 30. |

Email delivery fields: email.to, email.subject (supports {{date}}, {{agent.name}}), email.body_template, email.attachments.

Webhook delivery fields: webhook.url, webhook.method (default POST), webhook.headers (supports {{secret:token-name}}).
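
Two pieces of delivery behaviour lend themselves to a short illustration: deciding whether a destination fires for a given outcome, and filling `{{...}}` placeholders in a subject line or header. The sketch below is an approximation under stated assumptions. `should_deliver` and `render` are hypothetical helpers, and leaving unknown placeholders (such as `{{secret:token-name}}`) untouched is a choice made here, not documented orchestrator behaviour.

```python
import re


def should_deliver(condition: str, succeeded: bool) -> bool:
    """Decide whether a destination fires for the given execution outcome."""
    if condition == "always":
        return True
    if condition == "on_success":
        return succeeded
    if condition == "on_failure":
        return not succeeded
    raise ValueError(f"unknown condition: {condition!r}")


def render(template: str, values: dict) -> str:
    """Replace {{key}} placeholders; keys without a value are left as written."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        return str(values.get(key, match.group(0)))

    return re.sub(r"\{\{\s*([\w.:-]+)\s*\}\}", substitute, template)
```

For example, `render("Report {{date}} from {{agent.name}}", {"date": "2026-02-16", "agent.name": "pr-reviewer"})` yields `Report 2026-02-16 from pr-reviewer`.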


Examples

Minimal Agent (StandardRuntime)

apiVersion: 100monkeys.ai/v1
kind: Agent

metadata:
  name: pr-reviewer
  version: "1.0.0"
  description: "Reviews pull request diffs and returns structured feedback."

spec:
  runtime:
    language: python
    version: "3.11"

  task:
    instruction: |
      Review the provided code diff and return structured feedback covering:
      - Security vulnerabilities or concerns
      - Performance issues or opportunities
      - Code quality and maintainability
      Provide specific line references where relevant.

  security:
    network:
      mode: allow
      allowlist:
        - api.github.com
    resources:
      cpu: 1000
      memory: "1Gi"
      timeout: "300s"

  execution:
    mode: iterative
    max_iterations: 5

Minimal Agent (CustomRuntime)

apiVersion: 100monkeys.ai/v1
kind: Agent

metadata:
  name: diagram-generator
  version: "1.0.0"
  description: "Generates diagrams using Graphviz (requires custom image)."

spec:
  runtime:
    image: "ghcr.io/my-org/diagram-generator:latest"
    image_pull_policy: IfNotPresent  # Use cache if available

  task:
    instruction: |
      Generate a Graphviz diagram from the user's description.
      Save the DOT file to /workspace/diagram.dot
      Render to SVG: dot -Tsvg diagram.dot -o diagram.svg

  security:
    filesystem:
      read:
        - /workspace
      write:
        - /workspace
    resources:
      cpu: 1000
      memory: "1Gi"
      timeout: "120s"

  volumes:
    - name: workspace
      storage_class: ephemeral
      mount_path: /workspace
      access_mode: read-write
      ttl_hours: 1
      size_limit: "1Gi"

  execution:
    mode: iterative
    max_iterations: 3

Agent with JSON Output Validation

apiVersion: 100monkeys.ai/v1
kind: Agent

metadata:
  name: data-extractor
  version: "1.0.0"

spec:
  runtime:
    language: python
    version: "3.11"

  task:
    instruction: |
      Extract structured data from the provided document and output valid JSON
      matching the required schema.

  security:
    filesystem:
      read:
        - /workspace
      write:
        - /workspace/output
    resources:
      cpu: 500
      memory: "512Mi"
      timeout: "120s"

  volumes:
    - name: workspace
      storage_class: ephemeral
      mount_path: /workspace
      access_mode: read-write
      ttl_hours: 1
      size_limit: "500Mi"

  execution:
    mode: iterative
    max_iterations: 8
    validation:
      system:
        must_succeed: true
        allow_stderr: false
        timeout_seconds: 60
      output:
        format: json
        schema:
          type: object
          required: ["entities", "relationships", "confidence"]
          properties:
            entities:
              type: array
            relationships:
              type: array
            confidence:
              type: number
              minimum: 0
              maximum: 1
      semantic:
        threshold: 0.85
        fallback_on_unavailable: skip

  tools:
    - name: filesystem
      server: "mcp:filesystem"
      config:
        allowed_paths: ["/workspace"]
        access_mode: read-write

Code Reviewer (Judge Agent)

Judge agents evaluate the output of other agents and must return structured JSON with gradient scoring:

{
  "score": 0.85,
  "confidence": 0.92,
  "reasoning": "The code correctly implements the requirements with minor style issues.",
  "suggestions": ["Add type hints", "Improve error handling"],
  "verdict": "pass"
}

Required judge output fields:

| Field | Type | Description |
| --- | --- | --- |
| score | float (0.0–1.0) | Quality/correctness score on a continuous gradient. |
| confidence | float (0.0–1.0) | Judge's certainty in its assessment. |
| reasoning | string | Explanation for the score. |

Optional fields: signals, suggestions, verdict, and any custom metadata.
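
A consumer of judge output would typically check the three required fields before comparing the result against thresholds. The following is a minimal sketch of that step, assuming an inclusive (`>=`) comparison and a separate confidence threshold; `parse_judge_output` and `passes` are hypothetical names, not SDK functions.

```python
import json


def parse_judge_output(raw: str) -> dict:
    """Parse judge JSON and verify the required fields and their ranges."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("judge output must be a JSON object")

    for name in ("score", "confidence"):
        value = data.get(name)
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a number between 0.0 and 1.0")

    if not isinstance(data.get("reasoning"), str):
        raise ValueError("reasoning must be a string")
    return data


def passes(data: dict, score_threshold: float, confidence_threshold: float) -> bool:
    """Assumed gate: both values must meet or exceed their thresholds."""
    return data["score"] >= score_threshold and data["confidence"] >= confidence_threshold
```

Optional fields such as suggestions or verdict pass through unchanged, since only the required fields are inspected.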

apiVersion: 100monkeys.ai/v1
kind: Agent

metadata:
  name: code-quality-judge
  version: "1.0.0"
  labels:
    role: judge
    domain: code-review

spec:
  runtime:
    language: python
    version: "3.11"

  task:
    instruction: |
      You are a code quality judge. Evaluate the provided code output on:
      1. Correctness: Does it solve the stated problem?
      2. Code quality: Is it idiomatic and well-structured?
      3. Error handling: Does it handle edge cases?

      Always respond with valid JSON:
      {
        "score": <0.0-1.0>,
        "confidence": <0.0-1.0>,
        "reasoning": "<explanation>",
        "suggestions": ["<improvement>"],
        "verdict": "pass|fail|warning"
      }

  security:
    network:
      mode: none
    resources:
      cpu: 500
      memory: "512Mi"
      timeout: "60s"

  execution:
    mode: one-shot
    validation:
      system:
        must_succeed: true
      output:
        format: json
        schema:
          type: object
          required: ["score", "confidence", "reasoning"]
          properties:
            score:
              type: number
              minimum: 0
              maximum: 1
            confidence:
              type: number
              minimum: 0
              maximum: 1
            reasoning:
              type: string

Multi-Judge Consensus

For validation that requires multiple independent judges with consensus aggregation, use ValidatorSpec::MultiJudge inside spec.execution.validation. For workflow-level fan-out and consensus, use workflow ParallelAgents states. The two options compare as follows:

| Feature | spec.execution.validation.multi_judge | Workflow ParallelAgents |
| --- | --- | --- |
| Number of judges | Single | Multiple in parallel |
| Consensus algorithm | N/A | mean, min, max, majority |
| Configuration location | Agent manifest | Workflow manifest |
| Use case | Per-iteration quality gate | Final output multi-panel review |
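
The four consensus algorithm names can be read as simple reductions over the judges' scores. The sketch below shows one plausible interpretation. `aggregate` is a hypothetical function, and the meaning of majority (1.0 when more than half of the judges score at or above `pass_threshold`, otherwise 0.0) is an assumption of this sketch, not a documented definition.

```python
from statistics import mean


def aggregate(scores: list[float], algorithm: str, pass_threshold: float = 0.5) -> float:
    """Combine judge scores into a single value using the named algorithm."""
    if not scores:
        raise ValueError("at least one judge score is required")

    if algorithm == "mean":
        return mean(scores)
    if algorithm == "min":
        return min(scores)   # strictest judge decides
    if algorithm == "max":
        return max(scores)   # most lenient judge decides
    if algorithm == "majority":
        # Assumed reading: strict majority of judges at or above pass_threshold.
        passing = sum(1 for score in scores if score >= pass_threshold)
        return 1.0 if passing * 2 > len(scores) else 0.0
    raise ValueError(f"unknown consensus algorithm: {algorithm!r}")
```

With scores of 0.9, 0.6, and 0.3, min gates on the weakest review while majority still passes, which is why the choice of algorithm matters for a quality gate.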

Runtime Selection Guide

When to Use StandardRuntime

Choose StandardRuntime if:

  • Your agent only needs Python, Node.js, TypeScript, Rust, or Go
  • Standard system packages are sufficient (no OS binaries required)
  • You want AEGIS to manage image updates
  • You want deterministic, cached deployments

When to Use CustomRuntime

Choose CustomRuntime if:

  • You need OS-level packages (graphviz, imagemagick, system libraries)
  • You require a specialized language version that is not in the StandardRuntime registry
  • You want to include custom dependencies in the image
  • You're using a private container registry
  • You need image signing or vulnerability scanning

See Custom Runtime Agents for a complete walkthrough.


Image Pull Policies

IfNotPresent (Default)

Use local cached image if available; pull from registry only if missing.

Best for: Release versions (stable, changes rarely), reducing network traffic, offline deployments.

Always

Always pull from registry, even if already cached locally.

Best for: -dev, -latest tags (frequently updated), ensuring agents use current fixes.

Never

Only use cached images; fail if image is not already cached.

Best for: Locked production versions, guaranteed deterministic content, airgapped deployments. Images must be pre-loaded on the node via docker pull before execution. See Pre-Caching for Airgapped Environments.
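
The three policies differ only in how they react to the presence or absence of a cached image. That decision can be summarised in a few lines; `resolve_image` is a hypothetical function written for illustration, not the runtime's actual code.

```python
def resolve_image(policy: str, cached: bool) -> str:
    """Return "pull" or "use_cache" for the given policy and cache state."""
    if policy == "Always":
        return "pull"  # the cache is ignored
    if policy == "IfNotPresent":
        return "use_cache" if cached else "pull"
    if policy == "Never":
        if not cached:
            # Nothing to fall back on: the image must be pre-loaded on the node.
            raise RuntimeError("image is not cached and image_pull_policy is Never")
        return "use_cache"
    raise ValueError(f"unknown image_pull_policy: {policy!r}")
```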


Version History

| Version | Date | Notes |
| --- | --- | --- |
| v1.0 | 2026-02-16 | Kubernetes-style format with apiVersion/kind/metadata/spec. Canonical format for all new agents. |
