Aegis Orchestrator
Reference

Workflow Manifest Reference

Complete specification for the WorkflowManifest YAML format — schema, field definitions, all seven state kinds including ContainerRun for CI/CD steps, Subworkflow for workflow composition, consensus strategies, and complete examples.

API Version: 100monkeys.ai/v1 | Kind: Workflow | Status: Canonical

A Workflow Manifest is a declarative YAML document defining a finite state machine (FSM) that coordinates agent executions, container-based CI/CD steps, system commands, and human approval gates into a durable, restartable sequence. It uses the same Kubernetes-style apiVersion/kind/metadata/spec format as the Agent Manifest.


Annotated Full Example

apiVersion: 100monkeys.ai/v1      # required; must be exactly this value
kind: Workflow                     # required; must be exactly "Workflow"

metadata:
  name: dev-pipeline               # required; unique DNS-label name (lowercase, alphanumeric, hyphens)
  version: "1.0.0"                 # required; semantic version. (name + version) must be unique.
                                   # Overwriting an existing version requires the `force` parameter in management tools.
  description: "Iterative code generation with multi-judge validation."  # optional
  labels:                          # optional; key-value pairs for filtering and discovery
    category: development
    pattern: iterative-refinement
  annotations:                     # optional; non-identifying metadata
    docs: "https://docs.example.com/workflows/dev-pipeline"

spec:
  initial_state: GENERATE          # required; name of the first state to enter

  # Optional: global constants available to all states via {{workflow.context.KEY}}
  context:
    max_iterations: 10
    validation_threshold: 0.95

  # Optional: workflow-level shared storage (volumes accessible by any state)
  storage:
    workspace:                     # default shared workspace; auto-created at workflow start
      storage_class: ephemeral     # ephemeral | persistent
      ttl_hours: 1
      size_limit_mb: 1000
    shared_volumes:
      - name: code-artifacts
        storage_class: ephemeral
        ttl_hours: 24
        size_limit_mb: 500

  # Required: state machine definition (map of state name → state definition)
  states:

    GENERATE:
      kind: Agent                  # StateKind: Agent | System | Human | ParallelAgents | ContainerRun | ParallelContainerRun | Subworkflow
      agent: coder-v1              # required for Agent kind; deployed agent name or UUID
      input: |                     # Handlebars template rendered as the agent's task input
        {{#if blackboard.iteration_number}}
        Iteration {{blackboard.iteration_number}} failed.
        Error: {{state.feedback}}
        Fix the code.
        {{else}}
        {{intent}}
        {{/if}}
      volumes:                     # optional volume mounts for this state
        - volume: "{{workflow.storage.workspace}}"
          mount_path: /workspace
          access_mode: read-write
      transitions:
        - condition: always        # unconditional; first matching transition wins
          target: EXECUTE

    EXECUTE:
      kind: System
      command: python main.py      # required for System kind; shell command (Handlebars supported)
      env:
        CODE: "{{GENERATE.output}}"
        PYTHONPATH: /workspace
      transitions:
        - condition: exit_code_zero
          target: VALIDATE
        - condition: exit_code_non_zero
          target: REFINE
          feedback: "Execution failed: {{EXECUTE.output.stderr}}"

    VALIDATE:
      kind: Agent
      agent: judge-v1
      input: |
        Task: {{intent}}
        Output: {{EXECUTE.output.stdout}}
        Evaluate correctness and return JSON with score, confidence and reasoning.
      transitions:
        - condition: score_above
          threshold: 0.95
          target: COMPLETE
        - condition: score_below
          threshold: 0.95
          target: REFINE
          feedback: "{{VALIDATE.output.reasoning}}"

    AWAIT_APPROVAL:
      kind: Human
      prompt: |                    # required for Human kind; message shown to operator
        Review the generated output and approve or reject.
        Output: {{EXECUTE.output.stdout}}
      timeout: 3600s               # optional; omit to wait indefinitely
      default_response: reject     # optional; value taken if timeout elapses without signal
      transitions:
        - condition: input_equals_yes
          target: COMPLETE
        - condition: input_equals_no
          target: REFINE
          feedback: "{{human.feedback}}"

    AUDIT:
      kind: ParallelAgents
      agents:
        - agent: reviewer-v1
          input: "Review code quality: {{GENERATE.output}}"
          weight: 1.0              # optional; weight for consensus calculation (default 1.0)
          timeout_seconds: 120     # optional; per-agent timeout (default 60)
          poll_interval_ms: 500    # optional; polling interval (default 500)
        - agent: security-v1
          input: "Security audit: {{GENERATE.output}}"
          weight: 2.0              # security weighted higher
          timeout_seconds: 180
      consensus:
        strategy: weighted_average # weighted_average | majority | unanimous | best_of_n
        threshold: 0.85
        min_agreement_confidence: 0.75
        min_judges_required: 2     # optional; minimum judges that must complete (default 1)
        confidence_weighting:
          agreement_factor: 0.7    # weight for inter-judge agreement; sums to 1.0 with self_confidence_factor
          self_confidence_factor: 0.3
      transitions:
        - condition: consensus
          threshold: 0.85
          agreement: 0.75
          target: COMPLETE
        - condition: score_below
          threshold: 0.85
          target: REFINE
          feedback: "Audit failed (score: {{AUDIT.consensus.score}}): {{AUDIT.agents.0.output}}"

    REFINE:
      kind: System
      command: update_context
      env:
        ITERATION: "{{blackboard.iteration_number + 1}}"
        ERRORS: "{{state.feedback}}"
      transitions:
        - condition: custom
          expression: "{{blackboard.iteration_number < workflow.context.max_iterations}}"
          target: GENERATE
        - condition: custom
          expression: "{{blackboard.iteration_number >= workflow.context.max_iterations}}"
          target: FAILED

    COMPLETE:
      kind: System
      command: finalize
      transitions: []              # empty transitions list = terminal state

    FAILED:
      kind: System
      command: "echo 'Max iterations reached'"
      transitions: []              # terminal state

Field Reference

Top-Level Fields

| Field | Type | Required | Description |
|---|---|---|---|
| apiVersion | string | Yes | Must be 100monkeys.ai/v1. |
| kind | string | Yes | Must be Workflow. |
| metadata | object | Yes | Manifest metadata. |
| spec | object | Yes | Workflow specification. |

metadata

| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique workflow name. Pattern: ^[a-z0-9][a-z0-9-]{0,62}$. |
| version | string | Yes | Manifest semantic version (e.g., "1.0.0"). |
| description | string | No | Human-readable purpose description. |
| labels | map[string]string | No | Key-value labels for categorization and discovery. |
| annotations | map[string]string | No | Arbitrary non-identifying metadata (URLs, changelogs, JSON blobs). |
| input_schema | JSON Schema object | No | Declares named inputs accepted at execution time. See metadata.input_schema. |

metadata.input_schema

An optional JSON Schema object describing the named inputs this workflow accepts at execution time. Callers pass a JSON object matching this schema via --input (or the equivalent API field); the orchestrator validates the payload before the workflow starts.

| Field | Type | Required | Description |
|---|---|---|---|
| input_schema | JSON Schema object | No | Must be a JSON Schema object with type: object. |

Example:

metadata:
  name: data-pipeline
  version: "1.0.0"
  input_schema:
    type: object
    properties:
      dataset_path:
        type: string
        description: Path to the input dataset
      environment:
        type: string
        enum: [staging, production]
    required:
      - dataset_path
      - environment

The following JSON Schema property types are supported. The Zaru client context panel renders each type as a specific UI element:

| JSON Schema type | Zaru UI control |
|---|---|
| string | Text input |
| number / integer | Number spinner |
| boolean | Checkbox |
| enum | Dropdown select |
| object | Nested form group |
| array | Repeatable row builder |

When input_schema is present, the orchestrator validates the caller's input payload before creating the WorkflowExecution. A payload that fails schema validation is rejected with HTTP 422; no workflow execution is created. When input_schema is absent, any payload is accepted.

The validated payload is stored as WorkflowExecution.input and is immutable for the lifetime of the execution. It is exposed to all workflow states via the {{input.KEY}} Handlebars namespace — not just the first state.

input_schema vs spec.context: spec.context holds author-time constants declared in the manifest, accessible via {{workflow.context.KEY}}. metadata.input_schema declares caller-supplied runtime values, accessible via {{input.KEY}}. These are separate namespaces — context is static, input is dynamic.

Richer example with multiple field types:

metadata:
  name: invoice-pipeline
  description: "Processes and validates invoice documents."
  input_schema:
    type: object
    required:
      - document_text
      - vendor_id
    properties:
      document_text:
        type: string
        description: "Raw invoice text to process."
      vendor_id:
        type: string
        description: "Vendor identifier for routing."
      currency:
        type: string
        enum: ["USD", "EUR", "GBP"]
        description: "Currency for amount normalization. Default: USD."
      strict_validation:
        type: boolean
        description: "Whether to apply strict validation rules. Default: false."

CLI invocation:

aegis workflow run invoice-pipeline \
  --input '{"document_text": "Invoice #1234...", "vendor_id": "acme-corp", "currency": "USD"}'

Scope: Workflows are deployed at Tenant scope by default. Use --scope user for personal workflows or --scope global for platform-wide workflows (requires operator role). After deployment, use aegis workflow promote and aegis workflow demote to change scope. Scope is not a manifest field — it is controlled via the CLI --scope flag or API parameter at deploy time.

spec

| Field | Type | Required | Description |
|---|---|---|---|
| initial_state | string | Yes | Name of the first state to enter. Must exist in spec.states. |
| max_total_transitions | integer | No | Maximum total state transitions before the workflow terminates. Default: 50. Ceiling: 100. |
| context | map[string, any] | No | Global constants accessible via {{workflow.context.KEY}}. |
| storage | object | No | Workflow-level shared volume configuration. |
| states | map[string, WorkflowState] | Yes | State machine definition. Must contain initial_state. |

spec.storage

| Field | Type | Description |
|---|---|---|
| workspace | object | Default shared workspace. Auto-created at workflow start. Reference as {{workflow.storage.workspace}}. |
| workspace.storage_class | ephemeral \| persistent | Volume lifetime. |
| workspace.ttl_hours | integer | Hours until auto-deletion (ephemeral only). Counted from workflow start. |
| workspace.size_limit_mb | integer | Maximum volume size in MiB. Writes beyond this return ENOSPC. |
| shared_volumes | object[] | Additional named shared volumes. |
| shared_volumes[].name | string | Unique name. Reference as {{workflow.storage.shared_volumes.NAME}}. |
| shared_volumes[].storage_class | ephemeral \| persistent | Volume lifetime. |
| shared_volumes[].ttl_hours | integer | Hours until auto-deletion (ephemeral only). |
| shared_volumes[].size_limit_mb | integer | Maximum volume size in MiB (ephemeral only). |
| shared_volumes[].volume_id | string | Pre-created volume UUID (persistent only). |

spec.states[*] — Common Fields

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| kind | string | Yes | | Agent \| System \| Human \| ParallelAgents \| ContainerRun \| ParallelContainerRun \| Subworkflow |
| max_state_visits | integer | No | 5 | Maximum times this state can be entered before the workflow terminates. Ceiling: 20. |
| timeout | string | No | 300s | Maximum time in this state. Human-readable: "300s", "5m", "1h". |
| volumes | object[] | No | | Volume mounts for this state (Agent kind). |
| volumes[].volume | string | | | Volume reference. Supports Handlebars: {{workflow.storage.workspace}}. |
| volumes[].mount_path | string | | | Absolute path inside the container. |
| volumes[].access_mode | read-write \| read-only | | | Access mode enforced by AegisFSAL. |
| transitions | object[] | Yes | | Ordered transition rules. Evaluated top-to-bottom; first match wins. |
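Pulling the common fields together, a state of any kind might look like the following sketch (the state name REVIEW, the agent reviewer-v1, and the target states are hypothetical):

```yaml
REVIEW:
  kind: Agent
  agent: reviewer-v1        # hypothetical agent name
  max_state_visits: 3       # entering a 4th time terminates the workflow
  timeout: 10m              # overrides the 300s default
  volumes:
    - volume: "{{workflow.storage.workspace}}"
      mount_path: /workspace
      access_mode: read-only
  transitions:
    - condition: on_success
      target: DONE
    - target: RETRY         # unconditional fallback, placed last
```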

kind: Agent

| Field | Type | Required | Description |
|---|---|---|---|
| agent | string | Yes | Deployed agent name or UUID. |
| intent | string | No | Handlebars template rendered as the per-state natural-language steering for the agent ({{intent}} variable in the agent's prompt template). Falls back to the workflow-level caller intent when absent. |
| input | string | | Handlebars template rendered as the agent's task input. Supports {{intent}}, {{input.KEY}}, {{STATE.output}}, {{blackboard.KEY}}, {{state.feedback}}. |
| isolation | inherit \| firecracker \| docker \| process | No | Override isolation mode. Default: inherit (from node config). |

When the agent completes, the Blackboard is updated under the state name:

{
  "status": "success|failed",
  "output": "<agent final text output>",
  "score": 0.91,
  "iterations": 2
}

Use {{STATE.output}}, {{STATE.status}}, or {{STATE.score}} in downstream states.

kind: System

| Field | Type | Required | Description |
|---|---|---|---|
| command | string | Yes | Shell command to run. Handlebars templates supported. |
| env | map[string]string | No | Environment variables. Values support Handlebars. |
| workdir | string | No | Working directory. Default: /workspace. |

Output is written to the Blackboard as {state_name}.output.stdout, {state_name}.output.stderr, and {state_name}.output.exit_code.

Use condition: exit_code_zero, condition: exit_code_non_zero, or condition: exit_code with a value to branch on exit status.
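For example, a System state can branch on one specific exit code with condition: exit_code and a value, falling through to a default otherwise (a sketch; the state names and script are illustrative):

```yaml
CHECK:
  kind: System
  command: ./healthcheck.sh
  transitions:
    - condition: exit_code_zero
      target: DEPLOY
    - condition: exit_code
      value: "2"            # branch only when the exit code is exactly 2
      target: RECONFIGURE
    - target: FAILED        # unconditional fallback, placed last
```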

kind: Human

| Field | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | Message displayed to the operator. Handlebars templates supported. |
| timeout | string | No | Duration to wait before following timeout transitions. Omit to wait indefinitely. |
| default_response | string | No | Signal value automatically applied when timeout elapses. |

The workflow suspends until signalled via the workflow execution API:

POST /v1/workflows/executions/{execution_id}/signal
Content-Type: application/json

{ "response": "approved" }

The signal payload is available as {{human.feedback}} in transition feedback strings and downstream input templates.

kind: ParallelAgents

| Field | Type | Description |
|---|---|---|
| agents | object[] | List of parallel agent configurations. |
| consensus | object | Consensus aggregation configuration. |

agents[]

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| agent | string | Yes | | Deployed agent name or UUID. |
| input | string | | | Handlebars template for this agent's task input. |
| weight | number | No | 1.0 | Weight for consensus calculation. |
| timeout_seconds | integer | No | 60 | Per-agent execution timeout in seconds. |
| poll_interval_ms | integer | No | 500 | Polling interval in milliseconds. |

consensus

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| strategy | string | | | weighted_average \| majority \| unanimous \| best_of_n |
| threshold | number | No | 0.7 | Minimum score threshold (0.0–1.0). |
| min_agreement_confidence | number | | | Minimum consensus confidence required (0.0–1.0). Also accepted as agreement. |
| n | integer | For best_of_n | | Number of top judges to average. |
| min_judges_required | integer | No | 1 | Minimum judges that must succeed. State fails if fewer complete. |
| confidence_weighting.agreement_factor | number | No | 0.7 | Weight for inter-judge agreement. Must sum to 1.0 with self_confidence_factor. |
| confidence_weighting.self_confidence_factor | number | No | 0.3 | Weight for judges' self-reported confidence scores. |

Consensus Strategies

| Strategy | Algorithm | Best For |
|---|---|---|
| weighted_average | Weighted mean of scores; confidence penalised by inter-judge variance. | General-purpose gradient validation. |
| majority | Binary vote (score ≥ threshold = pass). Simple majority wins, confidence = vote margin. | Approve/reject decisions. |
| unanimous | All judges must score ≥ threshold. Confidence = minimum confidence across all judges. | Critical gates (security audits, production deploys). |
| best_of_n | Rank by score × confidence; take top N; compute weighted average of those N. | Reduce impact of outlier judges. |
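As a worked example of weighted_average (using the weighted-mean formula named above; the scores are illustrative), suppose the AUDIT state's judges return reviewer-v1 = 0.90 at weight 1.0 and security-v1 = 0.84 at weight 2.0:

```latex
\text{score} \;=\; \frac{\sum_i w_i \, s_i}{\sum_i w_i}
             \;=\; \frac{1.0 \cdot 0.90 \;+\; 2.0 \cdot 0.84}{1.0 + 2.0}
             \;=\; \frac{2.58}{3.0} \;=\; 0.86
```

which just clears the 0.85 threshold used in the annotated AUDIT example, so the consensus transition would fire (provided the confidence requirement is also met).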

State output structure:

{
  "consensus": {
    "score": 0.92,
    "confidence": 0.85,
    "strategy": "weighted_average",
    "metadata": {}
  },
  "individual_results": [
    { "agent_id": "reviewer-v1", "score": 0.9, "confidence": 0.88, "reasoning": "..." }
  ],
  "agents": [
    { "agent_id": "reviewer-v1", "output": "{\"score\": 0.9, ...}", "weight": 1.0 }
  ]
}

Use {{STATE.consensus.score}}, {{STATE.consensus.confidence}}, and {{STATE.agents.N.output}} in downstream transitions and templates.

kind: ContainerRun

Executes a single command in an isolated container without an LLM or iteration loop. Use ContainerRun for deterministic CI/CD steps: compiling, running tests, building images, and deploying artifacts.

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| name | string | | | Human-readable label displayed in the Synapse UI and event log. |
| image | string | Yes | | Container image reference. Accepts any Docker image (e.g. "rust:1.75-alpine", "docker:24-cli") or a private registry image (e.g. "ghcr.io/myorg/builder:v2"). |
| image_pull_policy | string | No | IfNotPresent | Always \| IfNotPresent \| Never. |
| command | string[] | Yes | | Command and arguments to execute (entrypoint override). Handlebars templates supported in each element. |
| shell | boolean | No | false | If true, join all command elements and pass them to sh -c. Use for multi-line scripts. |
| env | map[string]string | No | | Environment variables. Values support Handlebars. |
| workdir | string | No | /workspace | Working directory inside the container. |
| volumes | object[] | No | | Volume mounts. Same volumes declared at workflow level or per-state. |
| volumes[].name | string | | | Volume name that matches a workflow-level volume. |
| volumes[].mount_path | string | | | Absolute path inside the container. |
| volumes[].read_only | boolean | No | false | Mount as read-only. |
| resources | object | No | | CPU, memory, and timeout limits. |
| resources.cpu | integer | | | CPU millicores (e.g. 2000 = 2 vCPU). |
| resources.memory | string | | | Memory limit (e.g. "2Gi", "512Mi"). |
| resources.timeout | string | No | 5m | Hard timeout for the container. Human-readable: "5m", "1h". Activity-level timeout in Temporal. |
| registry_credentials | string | No | | Secrets vault path for private registry auth (e.g. "secret:cicd/docker-registry"). Resolved by the orchestrator; never exposed to the container. |
| retry | object | No | | Retry configuration for transient failures. |
| retry.max_attempts | integer | No | 1 | Total attempts including the first. 3 = one initial attempt plus two retries. |
| retry.backoff | string | No | "0s" | Initial delay between retries. Subsequent retries double the backoff (exponential). |

After the container exits, the Blackboard is updated under the state name:

{
  "exit_code": 0,
  "stdout": "<captured stdout, up to 1 MB>",
  "stderr": "<captured stderr, up to 1 MB>",
  "duration_ms": 4321
}

Use {{STATE.output.exit_code}}, {{STATE.output.stdout}}, and {{STATE.output.stderr}} in downstream states.

Transitions on ContainerRun states use exit_code_zero, exit_code_non_zero, exit_code, on_success, on_failure, and custom. See Condition Types below.
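A complete ContainerRun state might look like the following sketch (the state names, image, and commands are illustrative):

```yaml
BUILD:
  kind: ContainerRun
  name: "Compile and test"
  image: rust:1.75-alpine
  shell: true                 # join command elements and run via sh -c
  command:
    - |
      cargo build --release
      cargo test --release
  volumes:
    - name: code-artifacts    # matches a workflow-level shared volume
      mount_path: /workspace
  resources:
    cpu: 2000                 # 2 vCPU
    memory: "2Gi"
    timeout: 10m
  retry:
    max_attempts: 2           # one retry on transient failure
    backoff: 10s
  transitions:
    - condition: exit_code_zero
      target: DEPLOY
    - condition: exit_code_non_zero
      target: FIX
      feedback: "Build failed: {{BUILD.output.stderr}}"
```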

kind: ParallelContainerRun

Runs multiple container commands concurrently within a single workflow state. All steps start simultaneously; the state waits until they all finish before evaluating transitions. Use it for parallel test suites, multi-language builds, or concurrent quality gates.

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| steps | object[] | Yes | | List of container steps to run in parallel. Each step has the same fields as kind: ContainerRun (except retry), plus a required name. |
| steps[].name | string | Yes | | Unique label for this step. Used as the Blackboard output key. |
| completion | string | No | all_succeed | all_succeed \| any_succeed \| best_effort. See below. |

completion Strategies

| Strategy | Behaviour | Best For |
|---|---|---|
| all_succeed | State succeeds only if every step exits with code 0. | CI gates where all checks must pass. |
| any_succeed | State succeeds if at least one step exits with code 0. | Multi-platform builds where one target is sufficient. |
| best_effort | State always transitions with on_success; per-step results are available for inspection. | Information-gathering steps where failure is non-blocking. |

After all steps complete, the Blackboard is updated under the state name with per-step keys:

{
  "unit-tests": { "exit_code": 0, "stdout": "...", "stderr": "", "duration_ms": 12300 },
  "lint":       { "exit_code": 1, "stdout": "", "stderr": "error[E0...]", "duration_ms": 870 },
  "format-check": { "exit_code": 0, "stdout": "", "stderr": "", "duration_ms": 410 }
}

Access per-step output as {{STATE.output.STEPNAME.stdout}}, {{STATE.output.STEPNAME.stderr}}, and {{STATE.output.STEPNAME.exit_code}}.
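Per-step output like the Blackboard sample above could come from a state such as this sketch (the image and commands are illustrative):

```yaml
QUALITY_GATES:
  kind: ParallelContainerRun
  completion: all_succeed       # every step must exit 0
  steps:
    - name: unit-tests          # becomes the Blackboard key "unit-tests"
      image: rust:1.75-alpine
      command: ["cargo", "test"]
    - name: lint
      image: rust:1.75-alpine
      command: ["cargo", "clippy", "--", "-D", "warnings"]
    - name: format-check
      image: rust:1.75-alpine
      command: ["cargo", "fmt", "--check"]
  transitions:
    - condition: on_success
      target: DEPLOY
    - condition: on_failure
      target: FIX
      feedback: "Lint errors: {{QUALITY_GATES.output.lint.stderr}}"
```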

kind: Subworkflow

Invokes another deployed workflow as a child execution. Use Subworkflow to compose workflows — extract reusable pipelines into standalone workflows and call them from a parent.

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| workflow_id | string | Yes | | Deployed workflow name or UUID of the child workflow to invoke. |
| mode | string | No | blocking | blocking \| fire_and_forget. Determines whether the parent waits for the child to complete. |
| result_key | string | No | | Blackboard key where the child's final Blackboard is stored (blocking mode only). If omitted in blocking mode, the child result is still available as {{STATE.result}}. |
| input | string | | | Handlebars template rendered and passed as the child workflow's start input (--input). |

Execution Modes

| Mode | Parent Behaviour | Blackboard Output |
|---|---|---|
| blocking | Waits for the child workflow to reach a terminal state. | STATE.status, STATE.result (child's final output), STATE.child_execution_id. If result_key is set, the child's full final_blackboard is also written to blackboard.{result_key}. |
| fire_and_forget | Starts the child and immediately evaluates transitions. | STATE.status ("success" if child was started), STATE.child_execution_id. |

Blocking mode example:

RUN_PIPELINE:
  kind: Subworkflow
  workflow_id: data-pipeline-v2
  mode: blocking
  result_key: pipeline_output
  input: "{{workflow.context.task}}"
  timeout: 1800s
  transitions:
    - condition: on_success
      target: PROCESS_RESULT
    - condition: on_failure
      target: HANDLE_ERROR
      feedback: "Child workflow failed: {{RUN_PIPELINE.result}}"

Fire-and-forget example:

START_BACKGROUND_JOB:
  kind: Subworkflow
  workflow_id: nightly-cleanup
  mode: fire_and_forget
  input: |
    {"environment": "{{blackboard.deploy_env}}"}
  transitions:
    - condition: on_success
      target: CONTINUE_MAIN_FLOW

After the state completes, the Blackboard is updated under the state name:

Blocking mode:

{
  "status": "success",
  "result": "<child workflow's final output>",
  "child_execution_id": "wfx-child-a1b2c3d4-..."
}

Fire-and-forget mode:

{
  "status": "success",
  "child_execution_id": "wfx-child-a1b2c3d4-..."
}

Use {{STATE.result}} in downstream states to reference the child's output (blocking mode) and {{STATE.child_execution_id}} to reference the child execution.

Validation Rules

  • workflow_id must reference a deployed workflow. The engine validates this at state entry and fails with a descriptive error if the workflow is not found.
  • result_key is only meaningful in blocking mode; it is ignored in fire_and_forget mode.
  • Circular composition (workflow A calls B which calls A) is detected at runtime. The engine enforces a maximum child depth of 10 and fails with SubworkflowDepthExceeded if exceeded.
  • The child workflow's timeout is independent of the parent state's timeout. If the parent state times out before the child completes, the child is cancelled.

spec.states[*].transitions[]

| Field | Type | Required | Description |
|---|---|---|---|
| condition | string | No | Named condition type (see below). If absent, the transition is unconditional. |
| target | string | Yes | State to transition to. Must exist in spec.states. |
| feedback | string | No | Handlebars string written to {{state.feedback}} for the next state. |
| threshold | number | For score/consensus conditions | Score or consensus threshold (0.0–1.0). |
| agreement | number | For consensus | Minimum consensus.confidence required. |
| value | string | For exit_code, input_equals | Exact value to match. |
| expression | string | For custom | Handlebars boolean expression. |
| min / max | number | For score_between | Score range bounds (inclusive). |

Condition Types

| Condition | Applicable Kinds | Description |
|---|---|---|
| (absent) | All | Unconditional fallback. Always matches. Place last. |
| always | All | Explicit unconditional. Equivalent to absent condition. |
| on_success | Agent, System, ContainerRun, ParallelContainerRun, Subworkflow | State succeeded (exit code 0, agent returned success, or child workflow completed). |
| on_failure | Agent, System, ContainerRun, ParallelContainerRun, Subworkflow | State failed. |
| exit_code_zero | System, ContainerRun | Exit code was 0. |
| exit_code_non_zero | System, ContainerRun | Exit code was non-zero. |
| exit_code | System, ContainerRun | Exit code equals value. |
| score_above | Agent | Agent validation score > threshold. |
| score_below | Agent | Agent validation score < threshold. |
| score_between | Agent | Score between min and max (inclusive). |
| confidence_above | Agent | Validation confidence > threshold. |
| consensus | ParallelAgents | Consensus score ≥ threshold AND confidence ≥ agreement. |
| all_approved | ParallelAgents | All agents scored ≥ consensus threshold. |
| any_rejected | ParallelAgents | At least one agent scored < consensus threshold. |
| input_equals | Human | Signal payload matches value. |
| input_equals_yes | Human | Payload is "yes", "approve", "approved", or "true". |
| input_equals_no | Human | Payload is "no", "reject", "rejected", or "false". |
| custom | All | Handlebars expression evaluates to a truthy result. |

Evaluation order: transitions are evaluated top-to-bottom. The first matching transition is taken. Always include an unconditional fallback as the last transition.

Terminal states: a state with transitions: [] is terminal. When reached, the workflow execution completes with status completed.


Field Definitions

spec.context

Global constants accessible in any state's input, command, env, prompt, expression, and feedback fields via {{workflow.context.KEY}}. Use this for values shared across states (iteration limits, thresholds, URLs). Prefer context for scalars and storage for shared file systems.

spec.storage

Volumes are created by the orchestrator at workflow start, before any state executes. States reference them by Handlebars template variable — no UUID threading between states is needed.

| Class | Lifecycle | Use Case |
|---|---|---|
| ephemeral | Auto-deleted after ttl_hours | Build artifacts, scratch code, intermediate files |
| persistent | Kept until explicitly deleted via API | Shared datasets, long-lived outputs, auditable artifacts |

All file operations on workflow volumes are intercepted by AegisFSAL — every read and write is authorized against the agent's spec.security.filesystem policy and published as a StorageEvent domain event for the full audit trail.
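A persistent shared volume referencing a pre-created volume might be declared like this sketch (the volume name and volume_id are placeholders; note that ttl_hours and size_limit_mb apply only to ephemeral volumes):

```yaml
spec:
  storage:
    shared_volumes:
      - name: training-data
        storage_class: persistent
        volume_id: "a1b2c3d4-0000-0000-0000-000000000000"  # placeholder UUID of a pre-created volume
```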

Handlebars Template Variables

All input, command, env, prompt, expression, and feedback fields support Handlebars templates.

| Variable | Available In | Description |
|---|---|---|
| {{intent}} | All states | The caller's natural-language intent string passed at workflow start. |
| {{input.KEY}} | All states | Key from the workflow start --input payload (validated against metadata.input_schema). |
| {{workflow.context.KEY}} | All states | Value from spec.context. |
| {{workflow.storage.workspace}} | volumes[].volume | Reference to the shared workspace volume. |
| {{workflow.storage.shared_volumes.NAME}} | volumes[].volume | Reference to a named shared volume. |
| {{STATE.output}} | States after STATE | Final output text of an Agent state. |
| {{STATE.output.stdout}} | States after STATE | stdout of a System or ContainerRun state. |
| {{STATE.output.stderr}} | States after STATE | stderr of a System or ContainerRun state. |
| {{STATE.output.exit_code}} | States after STATE | Exit code of a System or ContainerRun state. |
| {{STATE.output.duration_ms}} | States after STATE | Wall-clock duration in milliseconds (ContainerRun states). |
| {{STATE.output.STEPNAME.stdout}} | States after STATE | Per-step stdout (ParallelContainerRun states). |
| {{STATE.output.STEPNAME.stderr}} | States after STATE | Per-step stderr (ParallelContainerRun states). |
| {{STATE.output.STEPNAME.exit_code}} | States after STATE | Per-step exit code (ParallelContainerRun states). |
| {{STATE.status}} | States after STATE | "success" or "failed". |
| {{STATE.score}} | States after STATE | Validation score (Agent states, 0.0–1.0). |
| {{STATE.consensus.score}} | States after STATE | Consensus score (ParallelAgents states). |
| {{STATE.consensus.confidence}} | States after STATE | Consensus confidence (ParallelAgents states). |
| {{STATE.agents.N.output}} | States after STATE | Individual judge full output (ParallelAgents, 0-indexed). |
| {{STATE.agents.N.output.FIELD}} | States after STATE | Field from a judge's JSON output (e.g. {{AUDIT.agents.0.output.score}}, {{AUDIT.agents.1.output.reasoning}}). |
| {{STATE.result}} | States after STATE | Final output of a Subworkflow state (blocking mode). |
| {{STATE.child_execution_id}} | States after STATE | Child workflow execution ID (Subworkflow states). |
| {{blackboard.KEY}} | All states | Any key set in spec.context or by a prior System state. |
| {{state.feedback}} | All states | feedback string from the incoming transition. |
| {{human.feedback}} | After Human states | Additional feedback text from the signal --feedback argument. |
| {{execution.id}} | All states | Workflow execution UUID. |

Blackboard

The Blackboard is a live string → JSON value map shared across all states in a workflow execution. It is owned and managed by the Temporal worker for the duration of the execution.

Lifecycle:

  1. Seed: Before the first state runs, the Blackboard is seeded with all spec.context constants. Callers may additionally supply --blackboard key-value pairs at invocation time; these merge in at the same moment.
  2. Mutation: As each state completes, its output is written to the Blackboard under the state name key. System states, ContainerRun states, and the update_context built-in all write structured JSON into the Blackboard.
  3. Snapshot: When the workflow completes, the final Blackboard state is captured and stored in PostgreSQL for audit and debugging.

Template access:

  • {{blackboard.KEY}} — resolves from the live Blackboard, updated after each state transition.
  • {{input.KEY}} — resolves from WorkflowExecution.input, which is immutable and not part of the Blackboard.

These two namespaces are distinct: input is what the caller provided at invocation; blackboard is the accumulated state built up during execution.

Example: A three-state workflow where each state reads from the previous:

spec:
  context:
    base_currency: "USD"
  states:
    extract:
      kind: Agent
      agent: invoice-extractor
      input: '{"text": "{{input.document_text}}", "currency": "{{workflow.context.base_currency}}"}'
      transitions:
        - condition: on_success
          target: validate
    validate:
      kind: Agent
      agent: quality-gate
      input: '{"artifact": "{{blackboard.extract}}", "criteria": ["amounts-balance", "vendor-exists"]}'
      transitions:
        - condition: on_success
          target: complete
    complete:
      kind: System
      command: "echo done"
      transitions: []        # terminal state

In this example, {{input.document_text}} is the caller-supplied input (immutable); {{blackboard.extract}} is the output written by the extract state (mutable, grows with each state).

Handlebars conditionals, loops, and built-in helpers are fully supported:

input: |
  {{#if blackboard.iteration_number}}
  Attempt {{blackboard.iteration_number}} failed: {{state.feedback}}
  {{else}}
  {{intent}}
  {{/if}}

Comparison operators in custom expressions: <, >, <=, >=, ==, !=, &&, ||, !

Built-in helpers:

| Helper | Example | Description |
|---|---|---|
| {{length array}} | {{length AUDIT.agents}} | Number of elements in an array. |
| {{upper string}} | {{upper input.task}} | Convert to upper case. |
| {{lower string}} | {{lower STATE.status}} | Convert to lower case. |
| {{trim string}} | {{trim STATE.output}} | Strip leading/trailing whitespace. |
| {{json object}} | {{json AUDIT.agents.0.output}} | Pretty-print an object as a JSON string. |
| {{first_line string}} | {{first_line GENERATE.output}} | Return only the first line of a multi-line string. |
| {{default value fallback}} | {{default blackboard.lang "python"}} | Return value if present and non-empty, otherwise fallback. |
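Helpers compose inside any templated field. For instance, an Agent input could combine several of them (a sketch; the referenced state and Blackboard keys are illustrative):

```yaml
input: |
  Target language: {{default blackboard.lang "python"}}
  Judges consulted: {{length AUDIT.agents}}
  Lead reviewer's summary: {{first_line AUDIT.agents.0.output.reasoning}}
```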

Missing key behavior: if a template references a key that doesn't exist in the Blackboard at render time, the engine renders a descriptive placeholder rather than crashing the workflow:

{{{{ ERROR: missing key 'GENERATE.output' — state GENERATE has not yet completed }}}}

This ensures a workflow state receives a clear signal about a misconfigured template rather than silently receiving an empty string or aborting execution.
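The lookup-with-placeholder behavior amounts to the following check. A Python sketch under the assumption that the placeholder simply embeds the missing path (the engine's message is richer, naming the unfinished state):

```python
def safe_lookup(ctx: dict, path: str) -> str:
    """Resolve a dotted path; render a visible error placeholder if any segment is missing."""
    node = ctx
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return f"{{{{ ERROR: missing key '{path}' }}}}"
        node = node[part]
    return str(node)

# GENERATE exists but has produced no output yet:
print(safe_lookup({"GENERATE": {}}, "GENERATE.output"))
```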


Judge Agent Output Format

Agents used as judges in ParallelAgents states must output JSON:

{
  "score": 0.85,
  "confidence": 0.92,
  "reasoning": "The code meets requirements with minor style issues.",
  "suggestions": ["Add type hints", "Improve error handling"],
  "verdict": "pass"
}

| Field | Type | Description |
|---|---|---|
| score | float (0.0–1.0) | Quality/correctness score on a continuous gradient. |
| confidence | float (0.0–1.0) | Judge's certainty in its assessment. |
| reasoning | string | Explanation for the score (minimum 10 characters). |
| suggestions | string[] | Improvement recommendations. |
| verdict | string | Human-readable decision: "pass", "fail", or "warning". |

For a complete judge agent manifest, see Agent Manifest Reference — Code Reviewer (Judge Agent).
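A downstream consumer might validate the shape like this. This is a hedged sketch based only on the field table above; the orchestrator's own validation rules are not specified here:

```python
def validate_judge_output(out: dict) -> list[str]:
    """Return a list of problems; an empty list means the output is well-formed."""
    problems = []
    for field in ("score", "confidence"):
        v = out.get(field)
        if not isinstance(v, (int, float)) or not 0.0 <= v <= 1.0:
            problems.append(f"{field} must be a float in [0.0, 1.0]")
    reasoning = out.get("reasoning", "")
    if not isinstance(reasoning, str) or len(reasoning) < 10:
        problems.append("reasoning must be a string of at least 10 characters")
    if out.get("verdict") not in ("pass", "fail", "warning"):
        problems.append('verdict must be "pass", "fail", or "warning"')
    return problems

good = {"score": 0.85, "confidence": 0.92,
        "reasoning": "The code meets requirements with minor style issues.",
        "verdict": "pass"}
assert validate_judge_output(good) == []
```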


Examples

100monkeys Classic

The classic iterative refinement loop: generate code, execute it, validate output with a judge agent, refine on failure.

apiVersion: 100monkeys.ai/v1
kind: Workflow

metadata:
  name: 100monkeys-classic
  version: "1.0.0"
  description: "Classic iterative refinement loop"
  labels:
    category: iterative
    pattern: generate-execute-validate-refine

spec:
  context:
    max_iterations: 10
    validation_threshold: 0.95

  initial_state: GENERATE

  states:
    GENERATE:
      kind: Agent
      agent: "{{input.agent_id}}"
      input: |
        {{#if blackboard.iteration_number}}
        Iteration {{blackboard.iteration_number}} — previous attempt failed.

        --- PREVIOUS CODE ---
        {{blackboard.previous_code}}

        --- ERRORS ---
        {{blackboard.validation_errors}}

        Fix the code.
        {{else}}
        {{intent}}
        {{/if}}
      transitions:
        - condition: always
          target: EXECUTE

    EXECUTE:
      kind: System
      command: "{{input.command}}"
      env:
        CODE: "{{GENERATE.output}}"
      transitions:
        - condition: exit_code_zero
          target: VALIDATE
        - condition: exit_code_non_zero
          target: REFINE
          feedback: "Execution failed: {{EXECUTE.output.stderr}}"

    VALIDATE:
      kind: Agent
      agent: "{{input.judge_id}}"
      input: |
        Task: {{intent}}
        Output: {{EXECUTE.output.stdout}}

        Validate and respond with JSON:
        {"valid": bool, "score": 0.0-1.0, "reasoning": "..."}
      transitions:
        - condition: score_above
          threshold: 0.95
          target: COMPLETE
        - condition: score_below
          threshold: 0.95
          target: REFINE
          feedback: "{{VALIDATE.output.reasoning}}"

    REFINE:
      kind: System
      command: update_blackboard
      env:
        ITERATION: "{{blackboard.iteration_number + 1}}"
        CODE: "{{GENERATE.output}}"
        ERRORS: "{{state.feedback}}"
      transitions:
        - condition: custom
          expression: "{{blackboard.iteration_number < workflow.context.max_iterations}}"
          target: GENERATE
        - condition: custom
          expression: "{{blackboard.iteration_number >= workflow.context.max_iterations}}"
          target: FAILED

    COMPLETE:
      kind: System
      command: echo
      env:
        RESULT: "{{EXECUTE.output.stdout}}"
      transitions: []

    FAILED:
      kind: System
      command: echo
      env:
        REASON: "Max iterations ({{workflow.context.max_iterations}}) reached"
      transitions: []

The Forge

Constitutional Development Lifecycle with human approval gates and adversarial multi-judge audit.

apiVersion: 100monkeys.ai/v1
kind: Workflow

metadata:
  name: the-forge
  version: "1.0.0"
  description: "Constitutional Development Lifecycle with adversarial testing"
  labels:
    category: development
    pattern: constitutional-lifecycle

spec:
  context:
    max_code_iterations: 5
    constitution_path: constitution.md

  initial_state: ANALYZE

  states:
    ANALYZE:
      kind: Agent
      agent: requirements-ai-v1
      input: |
        User request: {{input.user_request}}
        Codebase: {{input.codebase}}
        Generate requirements (functional, non-functional, acceptance criteria).
      transitions:
        - condition: always
          target: AWAIT_REQUIREMENTS_APPROVAL

    AWAIT_REQUIREMENTS_APPROVAL:
      kind: Human
      prompt: |
        Requirements:
        {{ANALYZE.output}}

        Approve? (yes/no)
      timeout: 3600s
      transitions:
        - condition: input_equals_yes
          target: ARCHITECT
        - condition: input_equals_no
          target: ANALYZE
          feedback: "{{human.feedback}}"

    ARCHITECT:
      kind: Agent
      agent: architect-ai-v1
      input: |
        Requirements: {{ANALYZE.output}}
        Constitution: {{workflow.context.constitution_path}}
        Design system architecture.
      transitions:
        - condition: always
          target: AWAIT_ARCH_APPROVAL

    AWAIT_ARCH_APPROVAL:
      kind: Human
      prompt: |
        Architecture:
        {{ARCHITECT.output}}

        Approve? (yes/no)
      timeout: 3600s
      transitions:
        - condition: input_equals_yes
          target: TEST
        - condition: input_equals_no
          target: ARCHITECT
          feedback: "{{human.feedback}}"

    TEST:
      kind: Agent
      agent: tester-ai-v1
      input: |
        Requirements: {{ANALYZE.output}}
        Architecture: {{ARCHITECT.output}}
        Write TDD tests (pytest).
      transitions:
        - condition: always
          target: CODE

    CODE:
      kind: Agent
      agent: coder-ai-v1
      input: |
        Tests: {{TEST.output}}
        Architecture: {{ARCHITECT.output}}
        Constitution: {{workflow.context.constitution_path}}
        Implement code to pass tests.
      transitions:
        - condition: always
          target: EXECUTE_TESTS

    EXECUTE_TESTS:
      kind: System
      command: pytest tests/ -v
      env:
        PYTHONPATH: /workspace
      transitions:
        - condition: exit_code_zero
          target: AUDIT
        - condition: exit_code_non_zero
          target: CODE
          feedback: "Tests failed: {{EXECUTE_TESTS.output.stderr}}"

    AUDIT:
      kind: ParallelAgents
      agents:
        - agent: reviewer-ai-v1
          input: "Review code quality: {{CODE.output}}"
          weight: 1.0
          timeout_seconds: 120
        - agent: critic-ai-v1
          input: "Adversarial review — try to break this code: {{CODE.output}}"
          weight: 1.5
          timeout_seconds: 90
        - agent: security-ai-v1
          input: "Security audit: {{CODE.output}}"
          weight: 2.0
          timeout_seconds: 180
      consensus:
        strategy: weighted_average
        threshold: 0.95
        min_agreement_confidence: 0.8
        min_judges_required: 2
        confidence_weighting:
          agreement_factor: 0.8
          self_confidence_factor: 0.2
      transitions:
        - condition: consensus
          threshold: 0.95
          agreement: 0.8
          target: COMPLETE
        - condition: score_below
          threshold: 0.95
          target: CODE
          feedback: |
            Audit failed (score: {{AUDIT.consensus.score}}, confidence: {{AUDIT.consensus.confidence}}):
            Reviewer: {{AUDIT.agents.0.output}}
            Critic:   {{AUDIT.agents.1.output}}
            Security: {{AUDIT.agents.2.output}}

    COMPLETE:
      kind: System
      command: finalize
      env:
        CODE: "{{CODE.output}}"
        TESTS: "{{TEST.output}}"
        AUDIT_SCORE: "{{AUDIT.consensus.score}}"
      transitions: []
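Under weighted_average, the consensus score is presumably the weight-normalized mean of the judges' scores. A sketch under that assumption (the confidence blend via agreement_factor and self_confidence_factor is not reproduced here):

```python
def weighted_average(judgments):
    """judgments: list of (score, weight) pairs, one per judge."""
    total_weight = sum(w for _, w in judgments)
    return sum(s * w for s, w in judgments) / total_weight

# The three AUDIT judges above, weighted 1.0 / 1.5 / 2.0:
score = weighted_average([(0.97, 1.0), (0.93, 1.5), (0.98, 2.0)])
print(round(score, 4))   # 0.9611
```

Note how the security judge's weight of 2.0 pulls the consensus toward its score, which is exactly why it carries the heaviest weight in the AUDIT state.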

Agent-Augmented CI/CD Pipeline

An agent generates code, a ContainerRun compiles it, a ParallelContainerRun runs quality checks concurrently, multi-agent review validates the result, and a final ContainerRun deploys the artifact.

apiVersion: 100monkeys.ai/v1
kind: Workflow

metadata:
  name: agent-cicd-pipeline
  version: "1.0.0"
  description: "AI-augmented delivery pipeline"
  labels:
    pattern: cicd

spec:
  initial_state: GENERATE

  storage:
    shared_volumes:
      - name: workspace
        persistent: true

  states:
    GENERATE:
      kind: Agent
      agent: coder-v2
      input: |
        {{intent}}
      volumes:
        - name: workspace
          mount_path: /workspace
      transitions:
        - condition: on_success
          target: BUILD
        - condition: on_failure
          target: FAILED

    BUILD:
      kind: ContainerRun
      name: "Compile"
      image: "rust:1.75-alpine"
      command: ["cargo", "build", "--release"]
      workdir: "/workspace"
      volumes:
        - name: workspace
          mount_path: /workspace
      resources:
        cpu: 2000
        memory: "4Gi"
        timeout: "10m"
      transitions:
        - condition: exit_code_zero
          target: TEST
        - condition: exit_code_non_zero
          target: FIX_BUILD
          feedback: "{{BUILD.output.stderr}}"

    FIX_BUILD:
      kind: Agent
      agent: coder-v2
      input: |
        Build failed — fix the errors:
        {{state.feedback}}
      volumes:
        - name: workspace
          mount_path: /workspace
      transitions:
        - condition: on_success
          target: BUILD
        - condition: on_failure
          target: FAILED

    TEST:
      kind: ParallelContainerRun
      steps:
        - name: unit-tests
          image: "rust:1.75-alpine"
          command: ["cargo", "test", "--workspace"]
          workdir: "/workspace"
          volumes:
            - name: workspace
              mount_path: /workspace
          resources:
            timeout: "5m"
        - name: lint
          image: "rust:1.75-alpine"
          command: ["cargo", "clippy", "--all-targets", "--", "-D", "warnings"]
          workdir: "/workspace"
          volumes:
            - name: workspace
              mount_path: /workspace
        - name: format-check
          image: "rust:1.75-alpine"
          command: ["cargo", "fmt", "--check"]
          workdir: "/workspace"
          volumes:
            - name: workspace
              mount_path: /workspace
      completion: all_succeed
      transitions:
        - condition: on_success
          target: REVIEW
        - condition: on_failure
          target: FIX_TESTS

    FIX_TESTS:
      kind: Agent
      agent: coder-v2
      input: |
        Tests/lint failed:
        Unit: {{TEST.output.unit-tests.stderr}}
        Lint: {{TEST.output.lint.stderr}}
        Format: {{TEST.output.format-check.stderr}}
      volumes:
        - name: workspace
          mount_path: /workspace
      transitions:
        - condition: on_success
          target: BUILD
        - condition: on_failure
          target: FAILED

    REVIEW:
      kind: ParallelAgents
      agents:
        - agent: security-auditor
          input: "Security audit: /workspace"
          weight: 1.5
          timeout_seconds: 300
        - agent: architecture-reviewer
          input: "Architecture review: /workspace"
          weight: 1.0
          timeout_seconds: 180
      consensus:
        strategy: weighted_average
        threshold: 0.8
        min_judges_required: 2
      transitions:
        - condition: consensus
          threshold: 0.8
          agreement: 0.75
          target: DEPLOY
        - condition: score_below
          threshold: 0.8
          target: FIX_REVIEW
          feedback: "{{REVIEW.consensus.score}} — {{REVIEW.agents.0.output}}"

    FIX_REVIEW:
      kind: Agent
      agent: coder-v2
      input: |
        Review feedback:
        {{state.feedback}}
      volumes:
        - name: workspace
          mount_path: /workspace
      transitions:
        - condition: on_success
          target: BUILD
        - condition: on_failure
          target: FAILED

    DEPLOY:
      kind: ContainerRun
      name: "Push to Registry"
      image: "docker:24-cli"
      command:
        - sh
        - -c
        - |
          docker build -t {{blackboard.registry}}/{{blackboard.app}}:{{blackboard.version}} /workspace
          docker push {{blackboard.registry}}/{{blackboard.app}}:{{blackboard.version}}
      shell: true
      registry_credentials: "secret:cicd/docker-registry"
      volumes:
        - name: workspace
          mount_path: /workspace
      resources:
        timeout: "15m"
      transitions:
        - condition: exit_code_zero
          target: DONE
        - condition: exit_code_non_zero
          target: FAILED

    DONE:
      kind: System
      command: "echo done"
      transitions: []

    FAILED:
      kind: System
      command: "echo failed"
      transitions: []
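The completion: all_succeed strategy above, along with any_succeed and best_effort from the version history, can be sketched as predicates over the parallel steps' exit codes. The best_effort semantics here are an assumption (the state itself never fails; step results are simply reported):

```python
def completed(strategy: str, exit_codes: list[int]) -> bool:
    succeeded = [code == 0 for code in exit_codes]
    if strategy == "all_succeed":
        return all(succeeded)      # every step must exit 0
    if strategy == "any_succeed":
        return any(succeeded)      # one passing step is enough
    if strategy == "best_effort":
        return True                # assumed: state always proceeds
    raise ValueError(f"unknown completion strategy: {strategy}")

print(completed("all_succeed", [0, 0, 1]))   # False — one step failed
```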

Security Gate (Unanimous)

All judges must independently approve before promotion to production. A single dissenter blocks the gate.

SECURITY_GATE:
  kind: ParallelAgents
  agents:
    - agent: static-analyzer
      input: "{{CODE.output}}"
      timeout_seconds: 300
    - agent: penetration-tester
      input: "{{CODE.output}}"
      timeout_seconds: 600
    - agent: security-expert
      input: "{{CODE.output}}"
      timeout_seconds: 180
  consensus:
    strategy: unanimous
    threshold: 0.95
    min_judges_required: 3
  transitions:
    - condition: score_above
      threshold: 0.94
      target: PRODUCTION
    - condition: always
      target: SECURITY_FAILED
      feedback: "Security gate failed — not all judges approved at 95%"
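A unanimous gate passes only when every judge's score independently clears the threshold; a sketch under that reading:

```python
def unanimous(scores: list[float], threshold: float = 0.95, min_judges: int = 3) -> bool:
    """True only if enough judges reported and every score clears the threshold."""
    return len(scores) >= min_judges and all(s >= threshold for s in scores)

print(unanimous([0.97, 0.99, 0.96]))   # True
print(unanimous([0.97, 0.99, 0.90]))   # False — a single dissenter blocks the gate
```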

Cross-References

| Topic | See |
|---|---|
| Writing judge agents for ParallelAgents states | Agent Manifest Reference |
| Step-by-step workflow construction guide | Building Workflows |
| Building CI/CD pipelines with ContainerRun | Building CI/CD Pipelines |
| Conceptual FSM overview and state kinds | Workflows |
| Gradient scoring and validation confidence | Validation Guide |
| Storage volumes and AegisFSAL | Storage Gateway |

Version History

| Version | Date | Notes |
|---|---|---|
| v1.0 | 2026-02-16 | Initial canonical format. All four original state kinds (Agent, System, Human, ParallelAgents). Named-condition transitions. Handlebars templates. Four consensus strategies. |
| v1.1 | 2026-03-05 | Added ContainerRun and ParallelContainerRun state kinds for CI/CD pipeline steps. Added completion strategies (all_succeed, any_succeed, best_effort). Extended condition types and Handlebars variables for exit-code-based routing. |
| v1.2 | 2026-03-25 | Added Subworkflow state kind for workflow-to-workflow composition. Two execution modes: blocking and fire_and_forget. Child workflows use Temporal's native child workflow support. |
